Abstract

In this paper, we consider a fitness-level model of a non-elitist mutation-only evolutionary
algorithm (EA) with tournament selection. The model provides upper and lower
bounds for the expected proportion of the individuals with fitness above given thresholds.
In the case of so-called monotone mutation, the obtained bounds imply that increasing the
tournament size improves the EA performance. As corollaries, we obtain an exponentially
vanishing tail bound for the Randomized Local Search on unimodal functions and polynomial
upper bounds on the runtime of EAs on the 2-SAT problem and on a family of Set Cover
problems proposed by E. Balas.


1 Introduction 


Evolutionary algorithms are randomized heuristic algorithms employing a population of tentative
solutions (individuals) and simulating an evolutionary type of search for optimal or
near-optimal solutions by means of selection, crossover and mutation operators. The evolutionary
algorithms with crossover operator are usually called genetic algorithms (GAs). Evolutionary
algorithms in general have a more flexible outline and include genetic programming,
evolution strategies, estimation of distribution algorithms and other evolution-inspired
paradigms. Evolutionary algorithms are now frequently used in areas of operations research,
engineering and artificial intelligence.

Two major outlines of an evolutionary algorithm are the elitist evolutionary algorithm, which
keeps a certain number of most promising individuals from the previous iteration, and the
non-elitist evolutionary algorithm, which computes all individuals of a new population independently
using the same randomized procedure. In this paper, we focus on the non-elitist
case.

One of the first theoretical results in the analysis of non-elitist GAs is the Schemata Theorem
(Goldberg, 1989), which gives a lower bound on the expected number of individuals from
some subsets of the search space (schemata) in the next generation, given the current population.
Significant progress in understanding the dynamics of GAs with non-elitist outline
was made in (Vose, 1995) by means of dynamical systems. However, most of the findings
in (Vose, 1995) apply to the infinite population case, and it is not clear how these results can
be used to estimate the applicability of GAs to practical optimization problems. A theoretical
possibility of constructing GAs that provably optimize an objective function with high probability
in polynomial time was shown in (Vitanyi, 2000) using rapidly mixing Markov chains.
However, (Vitanyi, 2000) provides only a very simple artificial example where this approach is
applicable, and further developments in this direction are not known to us.

One of the standard approaches to studying evolutionary algorithms in general is based on
the fitness levels (Wegener, 2002). In this approach, the solution space is partitioned into disjoint
subsets, called fitness levels, according to values of the fitness function. In (Lehre, 2011), the
fitness-level approach was first applied to upper-bound the runtime of non-elitist mutation-only
evolutionary algorithms. Here and below, by the runtime we mean the expected number
of fitness evaluations made until an optimum is found for the first time. Upper bounds on the
runtime of non-elitist GAs, involving the crossover operators, were obtained later in (Corus
et al., 2014; Eremeev, 2016). The runtime bounds presented in (Corus et al., 2014; Lehre, 2011)
are based on the drift analysis. In (Moraglio and Sudholt, 2015), a runtime result is proposed
for a class of convex search algorithms, including some non-elitist crossover-based GAs without
mutation, on the so-called concave fitness landscapes.

In this paper, we consider the non-elitist evolutionary algorithm which uses a tournament
selection and a mutation operator but no crossover. The $s$-tournament selection randomly
chooses $s$ individuals from the existing population and selects the best one of them (see e.g.
(Thierens and Goldberg, 1994)). The mutation operator is viewed as a randomized procedure,
which computes one offspring with a probability distribution depending on the given parent
individual. In this paper, evolutionary algorithms with such outline are denoted as EA. We
study the probability distribution of the EA population w.r.t. a set of fitness levels. The estimates
of the EA behavior are based on a priori known parameters of a mutation operator.
Using the proposed model we obtain upper and lower bounds on the expected proportion of the
individuals with fitness above certain thresholds. The lower bounds are formulated in terms
of linear algebra and resemble the bound in the Schemata Theorem (Goldberg, 1989). Instead of
schemata, here we consider the sets of genotypes with the fitness bounded from below. Besides
that, the bounds obtained in this paper may be applied recursively up to any given iteration.

Particular attention in this paper is paid to a special case when mutation is monotone.
Informally speaking, a mutation operator is monotone if throughout the search space the following
condition holds: the greater the fitness of a parent, the "better" offspring distribution
the mutation generates. One of the most well-known examples of monotone mutation is the
bitwise mutation in the case of the OneMax fitness function. As shown in (Borisovsky and
Eremeev, 2008), in the case of monotone mutation, one of the simplest evolutionary algorithms,
known as the (1+1) EA, has the best-possible performance in terms of runtime and
probability of finding the optimum.

In the case of monotone mutation, the lower bounds on expected proportions of the individuals
turn into equalities for the trivial evolutionary algorithm (1,1) EA. This implies that
the tournament selection at least has no negative effect on the EA performance in such a case.
This observation is complemented by the asymptotic analysis of the EA with monotone mutation
indicating that, given a sufficiently large population size and some technical conditions,
increasing the tournament size $s$ always improves the EA performance.

As corollaries of the general lower bounds on expected proportions of sufficiently fit individuals,
we obtain polynomial upper bounds on the Randomized Local Search runtime on
unimodal functions and upper bounds on the runtime of EAs on the 2-SAT problem and on a family
of Set Cover problems proposed by Balas (1984). Unlike the upper bounds on the runtime of
evolutionary algorithms with tournament selection from (Corus et al., 2014; Eremeev, 2016;
Lehre, 2011), which require a sufficiently large tournament size, the upper bounds on runtime
obtained here hold for any tournament size.

The rest of the paper is organized as follows. In Section 2 we give a formal description
of the considered EA, introduce an approximating model of the EA population and define
some required parameters of the probability distribution of a mutation operator in terms of
fitness levels. In Section 3, using the model from Section 2, we obtain lower and upper bounds
on expected proportions of genotypes with fitness above some given thresholds. Section 4 is
devoted to analysis of an important special case of monotone mutation operator, where the
bounds obtained in the previous section become tight or asymptotically tight. In Section 5 we
consider some illustrative examples of monotone mutation operators and demonstrate some
applications of the general results from Section 3. In particular, in this section we obtain new
lower bounds for the probability to generate optimal genotypes at any given iteration $t$ for a class
of unimodal functions, for the 2-SAT problem and for a family of set cover problems proposed by
E. Balas (in the latter two cases we also obtain upper bounds on the runtime of the EA). Besides
that, in Section 5 we give an upper bound on the expected proportion of optimal genotypes for
the OneMax fitness function. Section 6 contains concluding remarks.

This work extends the conference paper (Eremeev, 2000). The extension consists in comparison
of the EA behavior to that of the (1,1) EA, the (1,$\lambda$) EA and the (1+1) EA in Section 3
and in the new runtime bounds and tail bounds demonstrated in Section 5. The main results
from the conference paper are refined and provided with more detailed proofs.


2 Description of Algorithms and Approximating Model 

2.1 Notation and Algorithms 

Let the optimization problem consist in maximization of an objective function $f$ on the set
of feasible solutions $\mathrm{Sol} \subseteq X = \{0,1\}^n$, where $X$ is the search space of all binary strings of
length $n$.


The Evolutionary Algorithm EA. The EA searches for optimal or sub-optimal solutions
using a population of individuals, where each individual (genotype) $g$ is a bitstring
$(g^1, g^2, \dots, g^n)$, and its components $g^i \in \{0,1\}$, $i = 1, 2, \dots, n$, are called genes.

In each iteration the EA constructs a new population on the basis of the previous one. The
search process is guided by the values of a fitness function

$$\phi(g) = \begin{cases} f(g) & \text{if } g \in \mathrm{Sol}; \\ r(g) & \text{otherwise}, \end{cases}$$

where $r(\cdot)$ is a penalty function.

The individuals of the population may be ordered according to the sequence in which
they are generated, thus the population may be considered as a vector of genotypes
$X^t = (g_1^{(t)}, g_2^{(t)}, \dots, g_\lambda^{(t)})$, where $\lambda$ is the size of population, which is constant during the run of the EA,
and $t$ is the number of the current iteration. In this paper, we consider a non-elitist algorithmic
outline, where all individuals of a new population are generated independently from each
other with identical probability distribution depending on the existing population only.

Each individual is generated through selection of a parent genotype by means of a selection
operator, and modification of this genotype in the mutation operator. During the mutation, a
subset of genes in the genotype string $g$ is randomly altered. In general the mutation operator
may be viewed as a random variable $\mathrm{Mut}(g) \in X$ with the probability distribution depending
on $g$.

The genotypes of the initial population $X^0$ are generated with some a priori chosen probability
distribution. The stopping criterion may be e.g. an upper bound on the number of
iterations $t_{\max}$. The result is the best solution generated during the run. The EA has the following
scheme.

1. Generate the initial population $X^0$.

2. For $t := 0$ to $t_{\max} - 1$ do

2.1. For $k := 1$ to $\lambda$ do

Choose a parent genotype $g$ from $X^t$ by $s$-tournament selection.

Add $g_k^{(t+1)} := \mathrm{Mut}(g)$ to the population $X^{t+1}$.
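The scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation studied in the paper: the function names `tournament_select`, `bitwise_mutation` and `run_ea` are hypothetical, the tournament draws its $s$ contestants independently with replacement, the mutation is a bitwise mutation with rate $1/n$, and OneMax is used as the fitness function.

```python
import random

def tournament_select(population, fitness, s, rng):
    """Draw s individuals uniformly at random (with replacement) and return the fittest."""
    contestants = [rng.choice(population) for _ in range(s)]
    return max(contestants, key=fitness)

def bitwise_mutation(g, rng, p=None):
    """Hypothetical Mut: flip each gene independently with probability p (default 1/n)."""
    p = 1.0 / len(g) if p is None else p
    return tuple(b ^ 1 if rng.random() < p else b for b in g)

def run_ea(fitness, n, lam, s, t_max, mutate=bitwise_mutation, seed=0):
    """Non-elitist EA: each offspring is produced independently by selection and mutation."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(lam)]
    best = max(population, key=fitness)
    for _ in range(t_max):
        population = [mutate(tournament_select(population, fitness, s, rng), rng)
                      for _ in range(lam)]
        # The result of the run is the best solution seen so far.
        best = max(population + [best], key=fitness)
    return best, population

def onemax(g):
    return sum(g)

best, final_population = run_ea(onemax, n=10, lam=20, s=2, t_max=30)
```

The whole population is replaced in every iteration, so the best individual is tracked separately and never fed back into the search; this is what makes the outline non-elitist.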


In theoretical studies, the evolutionary algorithms are usually treated without a stopping
criterion (see e.g. (Neumann and Witt, 2010)). Unless otherwise stated, in the EA we will also
assume that $t_{\max} = \infty$.

Note that in the special case of the EA with $\lambda = 1$ we can assume that $s = 1$, since the
tournament selection has no effect in this case.


(1,$\lambda$) EA and (1+1) EA. In the following sections we will also need a description of two simple
evolutionary algorithms, known as the (1,$\lambda$) EA and the (1+1) EA.

The genotype of the current individual on iteration $\tau$ of the (1,$\lambda$) EA will be denoted by $b^{(\tau)}$,
and in the (1+1) EA it will be denoted by $x^{(\tau)}$. The initial genotypes $b^{(0)}$ and $x^{(0)}$ are generated
with some a priori chosen probability distribution. The only difference between the (1,$\lambda$) EA
and the (1+1) EA consists in the method of construction of an individual for iteration $\tau + 1$ using
the current individual of iteration $\tau$ as a parent. In both algorithms the new individual is built
with the help of a mutation operator, which we will denote by $\mathrm{Mut}'$. In the case of the (1,$\lambda$) EA,
the mutation operator is independently applied $\lambda$ times to the parent genotype $b^{(\tau)}$ and out
of $\lambda$ offspring a single genotype with the highest fitness value is chosen as $b^{(\tau+1)}$. (If there
are several offspring with the highest fitness, the new individual $b^{(\tau+1)}$ is chosen arbitrarily
among them.) In the (1+1) EA, the mutation operator is applied to $x^{(\tau)}$ once. If $x = \mathrm{Mut}'(x^{(\tau)})$
is such that $\phi(x) > \phi(x^{(\tau)})$, then $x^{(\tau+1)} := x$; otherwise $x^{(\tau+1)} := x^{(\tau)}$.
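To make the difference between the two update rules concrete, here is a minimal Python sketch of one iteration of each algorithm. The operator `flip_one_bit` is a hypothetical stand-in for $\mathrm{Mut}'$, the fitness is the number of ones, and the strict acceptance test in the (1+1) EA follows the description above.

```python
import random

def flip_one_bit(g, rng):
    """Hypothetical Mut': flip one gene chosen uniformly at random."""
    i = rng.randrange(len(g))
    return g[:i] + (1 - g[i],) + g[i + 1:]

def one_comma_lambda_step(b, fitness, lam, rng, mutate=flip_one_bit):
    """(1,lambda) EA: the parent is always replaced by the best of lam offspring."""
    offspring = [mutate(b, rng) for _ in range(lam)]
    return max(offspring, key=fitness)  # ties are resolved arbitrarily (first maximum)

def one_plus_one_step(x, fitness, rng, mutate=flip_one_bit):
    """(1+1) EA: the offspring replaces the parent only if it is strictly fitter."""
    y = mutate(x, rng)
    return y if fitness(y) > fitness(x) else x

rng = random.Random(1)
x = (0,) * 8
trace = [sum(x)]
for _ in range(100):
    x = one_plus_one_step(x, sum, rng)
    trace.append(sum(x))

b_next = one_comma_lambda_step((0,) * 8, sum, lam=4, rng=rng)
```

The fitness trace of the (1+1) EA never decreases, while the (1,$\lambda$) EA may lose fitness whenever all $\lambda$ offspring are worse than the parent.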


2.2 The Proposed Model 


The EA may be considered as a Markov chain in a number of ways. For example, the states
of the chain may correspond to different vectors of $\lambda$ genotypes that constitute the population
$X^t$ (see (Rudolph, 1994)). In this case the number of states in the Markov chain is $2^{n\lambda}$. Another
model representing the GA as a Markov chain is proposed in (Nix and Vose, 1992), where all
populations which differ only in the ordering of individuals are considered to be equivalent.
Each state of this Markov chain may be represented by a vector of $2^n$ components, where the
proportion of each genotype in the population is indicated by the corresponding coordinate
and the total number of states is $\binom{2^n + \lambda - 1}{\lambda}$. In the framework of this model, M. Vose and collaborators
have obtained a number of general results concerning the emergent behavior of GAs
by linking these algorithms to the infinite-population GAs (Vose, 1995).

The major difficulties in application of the above mentioned models to the analysis of GAs
for combinatorial optimization problems are connected with the necessity to use the high-grained
information about the fitness value of each genotype. In the present paper, we consider
one of the ways to avoid these difficulties by means of grouping the genotypes into larger
classes on the basis of their fitness.
Assume that $\phi_0 := \min\{\phi(g) : g \in X\}$ and there are $m$ level lines of the fitness function
fixed such that $\phi_0 < \phi_1 < \phi_2 < \dots < \phi_m$. The number of levels and the fitness values corresponding
to them may be chosen arbitrarily, but they should be relevant to the given problem
and the mutation operator to yield a meaningful model. Let us introduce the sequence of
Lebesgue subsets of $X$

$$H_i := \{g : \phi(g) \ge \phi_i\}, \quad i = 0, \dots, m.$$

Obviously, $H_0 = X$. For the sake of convenience, we define $H_{m+1} := \emptyset$. Also, we denote the
level sets $A_i := H_i \setminus H_{i+1}$, $i = 0, \dots, m$, which give a partition of $X$. Partitioning subsets $A_i$
are more frequently used in the literature on level-based analysis, compared to the Lebesgue subsets
$H_i$. In this paper we will frequently state that a genotype has a sufficiently high fitness,
therefore the use of subsets $H_i = \cup_{j=i}^{m} A_j$ will be more convenient in such cases. One of the
partitions used in the literature, called the canonical partition, defines $\phi_0, \dots, \phi_m$ as the set of
all fitness values on the search space $X$.

Now suppose that for all $i = 0, \dots, m$ and $j = 1, \dots, m$, the a priori lower bounds $\alpha_{ij}$ and
upper bounds $\beta_{ij}$ on mutation transition probabilities from subset $A_i$ to $H_j$ are known, i.e.

$$\alpha_{ij} \le \Pr\{\mathrm{Mut}(g) \in H_j\} \le \beta_{ij} \quad \text{for any } g \in A_i.$$

Fig. 1 illustrates the transitions considered in this expression.



Figure 1: Transitions from $A_i$ to $H_j$ under mutation.


Let $\mathbf{A}$ denote the matrix with the elements $\alpha_{ij}$, where $i = 0, \dots, m$ and $j = 1, \dots, m$. The
similar matrix of upper bounds $\beta_{ij}$ is denoted by $\mathbf{B}$. Let the population on iteration $t$ be represented
by the population vector

$$\mathbf{z}^{(t)} = (z_1^{(t)}, z_2^{(t)}, \dots, z_m^{(t)}),$$

where $z_i^{(t)} \in [0,1]$ is the proportion of genotypes from $H_i$ in population $X^t$. The population
vector $\mathbf{z}^{(t)}$ is a random vector, where $z_i^{(t)} \ge z_{i+1}^{(t)}$ for $i = 1, \dots, m - 1$ since $H_{i+1} \subseteq H_i$.
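As a small illustration of this definition, the following Python sketch computes a population vector from the fitness values of a population. The function name `population_vector` and the sample data are hypothetical; the thresholds passed in are $\phi_1, \dots, \phi_m$ (the level $\phi_0$ is omitted because $H_0 = X$).

```python
def population_vector(fitness_values, thresholds):
    """Return [z_1, ..., z_m], where z_i is the share of individuals with fitness >= phi_i.

    thresholds = [phi_1, ..., phi_m] must be given in increasing order.
    """
    lam = len(fitness_values)
    return [sum(1 for v in fitness_values if v >= phi) / lam for phi in thresholds]

# Five individuals with fitness values 0, 1, 1, 3, 2 and thresholds 1, 2, 3.
z = population_vector([0, 1, 1, 3, 2], thresholds=[1, 2, 3])
```

Here four of the five individuals reach the first threshold, two reach the second and one reaches the third, so the components of `z` are non-increasing, as required by $H_{i+1} \subseteq H_i$.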

Let $\Pr\{g^{(t)} \in H_j\}$ be the probability that an individual, which is added after selection and
mutation into $X^t$, has a genotype from $H_j$ for $j = 0, \dots, m$, and $t > 0$. According to the scheme
of the EA this probability is identical for all genotypes of $X^t$, i.e.
$\Pr\{g^{(t)} \in H_j\} = \Pr\{g_1^{(t)} \in H_j\} = \dots = \Pr\{g_\lambda^{(t)} \in H_j\}$.


Proposition 1 $E[z_i^{(t)}] = \Pr\{g^{(t)} \in H_i\}$ for all $t > 0$, $i = 1, \dots, m$.


Proof. Consider the sequence of identically distributed random variables $\xi_1^i, \xi_2^i, \dots, \xi_\lambda^i$, where
$\xi_l^i = 1$ if the $l$-th individual in the population $X^t$ belongs to $H_i$, otherwise $\xi_l^i = 0$. By the
definition, $z_i^{(t)} = \sum_{l=1}^{\lambda} \xi_l^i / \lambda$, consequently
$E[z_i^{(t)}] = \sum_{l=1}^{\lambda} E[\xi_l^i] / \lambda = \sum_{l=1}^{\lambda} \Pr\{g^{(t)} \in H_i\} / \lambda = \Pr\{g^{(t)} \in H_i\}$. □


Level-Based Mutation. If for some mutation operator there exist two equal matrices of lower
and upper bounds $\mathbf{A}$ and $\mathbf{B}$, i.e. $\alpha_{ij} = \beta_{ij}$ for all $i = 0, \dots, m$, $j = 1, \dots, m$, then the mutation
operator will be called level-based. By this definition, in the case of level-based mutation,
$\Pr\{\mathrm{Mut}(g) \in H_j\}$ does not depend on the choice of genotype $g \in A_i$ and the probabilities
$\gamma_{ij} = \Pr\{\mathrm{Mut}(g) \in H_j \mid g \in A_i\}$ are well-defined. In what follows, we call $\gamma_{ij}$ a cumulative transition
probability. The symbol $\mathbf{\Gamma}$ will denote the matrix of cumulative transition probabilities of
a level-based mutation operator.

If the EA uses a level-based mutation operator, then the probability distribution of population
$X^{t+1}$ is completely determined by the vector $\mathbf{z}^{(t)}$. In this case the EA may be viewed as
a Markov chain with states corresponding to the elements of

$$Z_\lambda := \{\mathbf{z} \in \{0, 1/\lambda, 2/\lambda, \dots, 1\}^m : z_i \ge z_{i+1},\ i = 1, \dots, m - 1\},$$

which is the set of all possible vectors of population of size $\lambda$. Here and below, the symbol $\mathbf{z}$ is
used to denote a vector from the set of all possible population vectors $Z_\lambda$.

The cardinality of set $Z_\lambda$ may be evaluated analogously to the number of states in the
model of Nix and Vose (1992). Now levels replace individual elements of the search space,
which gives a total of $\binom{m + \lambda}{\lambda}$ possible population vectors.
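The size of $Z_\lambda$ can be checked by brute force for small parameters. The sketch below is an illustration only (the function name `population_vectors` is hypothetical): it enumerates all non-increasing vectors with components in $\{0, 1/\lambda, \dots, 1\}$ and compares their number with the binomial coefficient $\binom{m+\lambda}{\lambda}$, which counts the ways to distribute $\lambda$ individuals over the $m + 1$ level sets.

```python
from itertools import combinations_with_replacement
from math import comb

def population_vectors(m, lam):
    """Enumerate all non-increasing vectors z in {0, 1/lam, ..., 1}^m."""
    vectors = []
    # Picking counts from a decreasing range yields non-increasing tuples.
    for counts in combinations_with_replacement(range(lam, -1, -1), m):
        vectors.append(tuple(c / lam for c in counts))
    return vectors

vectors = population_vectors(m=3, lam=4)
expected_size = comb(3 + 4, 4)
```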


3 Bounds on Expected Proportions of Fit Individuals 

In this section, our aim is to obtain lower and upper bounds on $E[\mathbf{z}^{(t)}]$ for arbitrary $s$ and $t$ if
the distribution of the initial population is known.

Let $P_{ch}(S, \mathbf{z})$ denote the probability that the genotype, chosen by the tournament selection
from a population with vector $\mathbf{z}$, belongs to a subset $S \subseteq X$. Note that if the current population
is represented by the vector $\mathbf{z}^{(t)} = \mathbf{z}$, then a genotype obtained by selection and mutation
would belong to $H_j$ with a conditional probability

$$\Pr\{g^{(t+1)} \in H_j \mid \mathbf{z}^{(t)} = \mathbf{z}\} = \sum_{i=0}^{m} \sum_{g \in A_i} \Pr\{\mathrm{Mut}(g) \in H_j \mid g\}\, P_{ch}(\{g\}, \mathbf{z}). \qquad (1)$$


3.1 Lower Bounds 

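Expression (1) and the definitions of bounds $\alpha_{ij}$ yield for all $j = 1, \dots, m$:

$$\Pr\{g^{(t+1)} \in H_j \mid \mathbf{z}^{(t)} = \mathbf{z}\} \ge \sum_{i=0}^{m} \alpha_{ij} \sum_{g \in A_i} P_{ch}(\{g\}, \mathbf{z}) = \sum_{i=0}^{m} \alpha_{ij} P_{ch}(A_i, \mathbf{z}), \qquad (2)$$

which turns into an equality in the case of level-based mutation and $\mathbf{A} = \mathbf{\Gamma}$.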

Given a tournament size $s$ we obtain the following selection probabilities:
$P_{ch}(H_i, \mathbf{z}) = 1 - (1 - z_i)^s$, $i = 1, \dots, m$, and, consequently,
$P_{ch}(A_i, \mathbf{z}) = (1 - z_{i+1})^s - (1 - z_i)^s$, where we put $z_0 := 1$ and $z_{m+1} := 0$. This leads
to the inequality:

$$\Pr\{g^{(t+1)} \in H_j \mid \mathbf{z}^{(t)} = \mathbf{z}\} \ge \sum_{i=0}^{m} \alpha_{ij} \big((1 - z_{i+1})^s - (1 - z_i)^s\big).$$
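The selection probabilities are easy to evaluate numerically. The following Python sketch (the function name `level_selection_probs` is hypothetical) computes the probability that $s$-tournament selection picks a parent from each level set $A_i$, using the conventions $z_0 = 1$ and $z_{m+1} = 0$ for the boundary terms, and assuming that the $s$ contestants are drawn independently.

```python
def level_selection_probs(z, s):
    """P_ch(A_i, z) for i = 0..m, given z = [z_1, ..., z_m] and tournament size s."""
    ext = [1.0] + list(z) + [0.0]  # z_0 = 1 and z_{m+1} = 0
    return [(1 - ext[i + 1]) ** s - (1 - ext[i]) ** s for i in range(len(z) + 1)]

z = [0.8, 0.4, 0.2]
probs_s1 = level_selection_probs(z, s=1)
probs_s2 = level_selection_probs(z, s=2)
```

With $s = 1$ the result is simply the share of each level set in the population, while a larger tournament shifts probability mass towards the higher levels; in both cases the probabilities telescope to a total of 1.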
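By the total probability formula,

$$\Pr\{g^{(t+1)} \in H_j\} = \sum_{\mathbf{z} \in Z_\lambda} \Pr\{g^{(t+1)} \in H_j \mid \mathbf{z}^{(t)} = \mathbf{z}\} \Pr\{\mathbf{z}^{(t)} = \mathbf{z}\} \qquad (3)$$

$$\ge \sum_{\mathbf{z} \in Z_\lambda} \sum_{i=0}^{m} \alpha_{ij} \big((1 - z_{i+1})^s - (1 - z_i)^s\big) \Pr\{\mathbf{z}^{(t)} = \mathbf{z}\}$$

$$= \sum_{i=0}^{m} \alpha_{ij} E\big[(1 - z_{i+1}^{(t)})^s - (1 - z_i^{(t)})^s\big]$$

$$= \alpha_{mj} E[(1 - z_{m+1}^{(t)})^s] - \alpha_{0j} E[(1 - z_0^{(t)})^s] - \sum_{i=1}^{m} (\alpha_{ij} - \alpha_{i-1,j}) E[(1 - z_i^{(t)})^s], \qquad (4)$$

where the last expression is obtained by regrouping the summation terms. Proposition 1 implies
that $E[z_j^{(t+1)}] = \Pr\{g^{(t+1)} \in H_j\}$. Consequently, since $(1 - z_{m+1}^{(t)})^s = 1$ and $(1 - z_0^{(t)})^s = 0$,
expression (4) gives a lower bound

$$E[z_j^{(t+1)}] \ge \alpha_{mj} - \sum_{i=1}^{m} (\alpha_{ij} - \alpha_{i-1,j}) E[(1 - z_i^{(t)})^s]. \qquad (5)$$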

Note that (5) turns into an equality in the case of level-based mutation and $\mathbf{A} = \mathbf{\Gamma}$. We
would like to use (5) recursively $t$ times in order to estimate $E[\mathbf{z}^{(t)}]$ for any $t$, given the initial
vector $E[\mathbf{z}^{(0)}]$. It will be shown in the sequel that such a recursion is possible under monotonicity
assumptions defined below.

Monotone Matrices and Mutation Operators. In what follows, any $((m + 1) \times m)$-matrix $\mathbf{\Delta}$
with elements $\delta_{ij}$, $i = 0, \dots, m$, $j = 1, \dots, m$, will be called monotone iff $\delta_{i-1,j} \le \delta_{ij}$ for all $i, j$
from 1 to $m$. Monotonicity of a matrix of bounds on transition probabilities means that the
greater fitness level $A_i$ a parent solution has, the greater is its bound on transition probability
to any subset $H_j$, $j = 1, \dots, m$. Note that for any mutation operator the monotone upper and
lower bounds exist. Formally, for any mutation operator a valid monotone matrix of lower
bounds would be $\mathbf{A} = \mathbf{0}$, where $\mathbf{0}$ is a zero matrix. A monotone matrix of upper bounds, valid
for any mutation operator, is $\mathbf{B} = \mathbf{U}$, where $\mathbf{U}$ is the matrix with all elements equal to 1. These are
extreme and impractical examples. In reality a problem may be connected with the absence of
bounds which are sharp enough to evaluate the mutation operator properly.

If, given some set of levels $\phi_1, \dots, \phi_m$, there exist two matrices of lower and upper
bounds $\mathbf{A}, \mathbf{B}$ such that $\mathbf{A} = \mathbf{B}$ and these matrices are monotone, then operator Mut is called
monotone w.r.t. the set of levels $\phi_1, \dots, \phi_m$. In this paper, we will also call such operators monotone
for short. Note that by the definition, any monotone mutation operator is level-based,
since $\alpha_{ij} = \beta_{ij}$ for all $i, j$. The following proposition shows how the monotonicity property
may be equivalently defined in terms of cumulative transition probabilities.

Proposition 2 A mutation operator Mut is monotone w.r.t. the set of levels $\phi_1, \dots, \phi_m$ iff for any
$i, i', j \in \{0, \dots, m\}$ such that $i \ge i'$, for any genotypes $g \in A_i$, $g' \in A_{i'}$ holds

$$\Pr\{\mathrm{Mut}(g) \in H_j\} \ge \Pr\{\mathrm{Mut}(g') \in H_j\}.$$

Proof. Indeed, suppose that $\mathbf{A} = \mathbf{B}$ and these matrices are monotone. Then for any genotypes
$g \in A_i$ and $g' \in A_{i'}$, $i \ge i'$, holds

$$\Pr\{\mathrm{Mut}(g) \in H_j\} \ge \alpha_{ij} \ge \alpha_{i'j} = \beta_{i'j} \ge \Pr\{\mathrm{Mut}(g') \in H_j\}.$$

Conversely, if for any level $j$ and any genotypes $g \in A_i$ and $g' \in A_{i'}$, $i \ge i'$, holds
$\Pr\{\mathrm{Mut}(g) \in H_j\} \ge \Pr\{\mathrm{Mut}(g') \in H_j\}$, then taking $i = i'$ we note that $\Pr\{\mathrm{Mut}(g) \in H_j\}$ is
equal for all $g \in A_i$ and one can assign $\alpha_{ij} = \beta_{ij} = \Pr\{\mathrm{Mut}(g) \in H_j \mid g \in A_i\}$. The resulting
matrices $\mathbf{A}$ and $\mathbf{B}$ are obviously monotone. □


Proposition 2 implies that in the case of the canonical partition, i.e. when $\{\phi_0, \phi_1, \dots, \phi_m\}$ is
the set of all values of $\phi(\cdot)$, operator Mut is monotone w.r.t. $\phi_1, \dots, \phi_m$ iff for any genotypes $g$
and $g'$ such that $\phi(g) \ge \phi(g')$, for any $r \in \mathbb{R}$ holds

$$\Pr\{\phi(\mathrm{Mut}(g)) \ge r\} \ge \Pr\{\phi(\mathrm{Mut}(g')) \ge r\}.$$


The monotonicity of the mutation operator w.r.t. a canonical partition is equivalent to the definition
of monotone reproduction operator from (Borisovsky and Eremeev, 2001) in the case
of single-parent, single-offspring reproduction. According to the terminology of Daley (1968),
such random operators are also called stochastically monotone.

As a simple example of a monotone mutation operator we can consider a point mutation
operator: with probability $q > 0$ keep the given genotype unchanged; otherwise (with probability
$1 - q$) choose $i$ randomly from $\{1, \dots, n\}$ and change gene $i$. As a fitness function we
take the function $\mathrm{OneMax}(g) = \sum_{i=1}^{n} g^i$, where $g \in \{0,1\}^n$. Let us assume $m = n$ and define
the thresholds $\phi_0 := 0, \phi_1 := 1, \dots, \phi_n := n$. All genotypes with the same fitness function value
have equal probability to produce an offspring with any required fitness value, therefore this
is a case of level-based mutation. In such a case identical matrices of lower and upper bounds
$\mathbf{A}$ and $\mathbf{B}$ exist and they both equal the matrix of cumulative transition probabilities $\mathbf{\Gamma}$. The
latter consists of the following elements: $\gamma_{ij} = 1$ for all $i = 1, \dots, n$, $j = 0, \dots, i - 1$, since
point mutation cannot reduce the fitness by more than one level; $\gamma_{i,i+1} = (1 - q)(n - i)/n$ for
$i = 0, \dots, n - 1$ because with probability $(1 - q)(n - i)/n$ any genotype is upgraded;

$$\gamma_{ii} = \begin{cases} q + \gamma_{i,i+1} & \text{if } i = 1, \dots, n - 1; \\ q & \text{if } i = n, \end{cases}$$

because a genotype in $H_i$ can be obtained as an offspring of a genotype from $A_i$ in two ways:
either the parent genotype has been upgraded (which happens with probability $\gamma_{i,i+1}$) or it
stays at level $i$, which happens with probability $q$; finally $\gamma_{ij} = 0$, $i = 0, \dots, n - 2$, $j =
i + 2, \dots, n$, because point mutation cannot increase the level number by more than 1. The
elements of matrix $\mathbf{\Gamma}$ obviously satisfy the monotonicity condition $\gamma_{ij} - \gamma_{i-1,j} \ge 0$ when $i \ne j$.
For the case of $i = j$ we have $\gamma_{ii} - \gamma_{i-1,i} = q + (q - 1)/n$, which is nonnegative if $q \ge 1/(n + 1)$.
Therefore with any $q \ge 1/(n + 1)$, the matrix $\mathbf{\Gamma}$ is monotone in this example and the mutation
operator is monotone as well.
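The threshold $q \ge 1/(n+1)$ can be verified numerically. The sketch below builds the cumulative transition probabilities of the point mutation on OneMax exactly as listed above and tests the monotonicity condition on the columns $j = 1, \dots, n$. The function names are hypothetical, and column 0 (which equals 1 for every row because $H_0 = X$) is stored only to keep the indices aligned with the text.

```python
def onemax_point_mutation_gamma(n, q):
    """gamma[i][j] = Pr{Mut(g) in H_j | g in A_i} for i, j = 0..n."""
    gamma = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        up = (1 - q) * (n - i) / n      # probability of gaining one level
        for j in range(n + 1):
            if j < i:
                gamma[i][j] = 1.0       # at most one level can be lost
            elif j == i:
                gamma[i][j] = q + up    # genotype unchanged or upgraded
            elif j == i + 1:
                gamma[i][j] = up
    return gamma

def is_monotone(matrix, tol=1e-12):
    """Check gamma[i-1][j] <= gamma[i][j] for all rows i >= 1 and columns j >= 1."""
    rows, cols = len(matrix), len(matrix[0])
    return all(matrix[i][j] >= matrix[i - 1][j] - tol
               for i in range(1, rows) for j in range(1, cols))

n = 5  # the threshold is 1/(n + 1) = 1/6
monotone_above = is_monotone(onemax_point_mutation_gamma(n, q=0.2))
monotone_below = is_monotone(onemax_point_mutation_gamma(n, q=0.1))
```

For $n = 5$ the value $q = 0.2$ lies above the threshold $1/6$ and the matrix is monotone, whereas $q = 0.1$ violates the condition on the diagonal.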


Proposition 3 If $\mathbf{A}$ is monotone, then for any tournament size $s \ge 1$ and $j = 1, \dots, m$ holds

$$E[z_j^{(t+1)}] \ge \alpha_{0j} + \sum_{i=1}^{m} (\alpha_{ij} - \alpha_{i-1,j}) E[z_i^{(t)}], \qquad (6)$$

besides that, (6) is an equality if $s = 1$, operator Mut is monotone and $\mathbf{A}$ is its matrix of cumulative
transition probabilities.


Proof. Monotonicity of matrix $\mathbf{A}$ implies that $\alpha_{ij} - \alpha_{i-1,j} \ge 0$ for all $i = 1, \dots, m$, $j =
1, \dots, m$, so the simple estimate $(1 - z_i^{(t)})^s \le 1 - z_i^{(t)}$ may be applied to all terms of the sum
in (5) and we get

$$E[z_j^{(t+1)}] \ge \alpha_{mj} - \sum_{i=1}^{m} (\alpha_{ij} - \alpha_{i-1,j}) (1 - E[z_i^{(t)}]).$$

Regrouping the terms in the last bound we obtain the required inequality (6).

Finally, note that lower bound (5) holds as an equality if the mutation operator is monotone
and $\mathbf{A} = \mathbf{\Gamma}$, therefore the last lower bound is an equality in the case of monotone $\mathbf{A} = \mathbf{\Gamma}$
and $s = 1$. □


Lower Bounds from Linear Algebra. Let $\mathbf{W}$ be a $(m \times m)$-matrix with elements $w_{ij} =
\alpha_{ij} - \alpha_{i-1,j}$, let $\mathbf{I}$ be the identity matrix of the same size, and denote $\mathbf{a} = (\alpha_{01}, \dots, \alpha_{0m})$. With
these notations, inequality (6) takes a short form $E[\mathbf{z}^{(t+1)}] \ge \mathbf{a} + E[\mathbf{z}^{(t)}] \mathbf{W}$. Here and below,
the inequality sign "$\le$" for some vectors $\mathbf{x} = (x_1, \dots, x_m)$ and $\mathbf{y} = (y_1, \dots, y_m)$ means the
component-wise comparison, i.e. $\mathbf{x} \le \mathbf{y}$ iff $x_i \le y_i$ for all $i$. The following theorem gives a
component-wise lower bound on vector $E[\mathbf{z}^{(t)}]$ for any $t$.

Theorem 1 Suppose that $\|\cdot\|$ is some matrix norm. If matrix $\mathbf{A}$ is monotone and $\lim_{t \to \infty} \|\mathbf{W}^t\| = 0$,
then for all $t \ge 1$ holds

$$E[\mathbf{z}^{(t)}] \ge E[\mathbf{z}^{(0)}] \mathbf{W}^t + \mathbf{a} (\mathbf{I} - \mathbf{W})^{-1} (\mathbf{I} - \mathbf{W}^t) \qquad (7)$$

and inequality (7) turns into an equation if the tournament size $s = 1$, the mutation operator used in
the EA is monotone and $\mathbf{A}$ is its matrix of cumulative transition probabilities.


The proof of this theorem is similar to the well-known inductive proof of the formula
$S_t = a(1 - w)^{-1}(1 - w^t)$, $w \in \mathbb{R}$, $a \in \mathbb{R}$, for a sum of terms $a_1, \dots, a_t$ in a geometric
series $a_i = a w^{i-1}$. Note that the recursion $E[\mathbf{z}^{(t+1)}] \ge \mathbf{a} + E[\mathbf{z}^{(t)}] \mathbf{W}$ is similar to the recursive
formula $S_{t+1} = a + S_t w$, assuming $S_0 = 0$. However, in our case matrices and vectors
replace numbers, we have to deal with inequalities rather than equalities and the initial
element $E[\mathbf{z}^{(0)}]$ may be non-zero unlike $S_0$.


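Proof of Theorem 1. Let us consider a sequence of $m$-dimensional vectors
$\mathbf{u}^{(0)}, \mathbf{u}^{(1)}, \dots, \mathbf{u}^{(t)}, \dots$, where $\mathbf{u}^{(0)} = E[\mathbf{z}^{(0)}]$, $\mathbf{u}^{(t+1)} = \mathbf{a} + \mathbf{u}^{(t)} \mathbf{W}$. We will show that
$E[\mathbf{z}^{(t)}] \ge \mathbf{u}^{(t)}$ for any $t$, using induction on $t$. Indeed, for $t = 0$ the inequality holds by the
definition of $\mathbf{u}^{(0)}$. Now note that the right-hand side of (6) will not increase if the components
of $E[\mathbf{z}^{(t)}]$ are substituted with their lower bounds. Therefore, assuming we already
have $E[\mathbf{z}^{(\tau)}] \ge \mathbf{u}^{(\tau)}$ for some $\tau$ and substituting $\mathbf{u}^{(\tau)}$ for $E[\mathbf{z}^{(\tau)}]$ we make an inductive step
$E[\mathbf{z}^{(\tau+1)}] \ge \mathbf{u}^{(\tau+1)}$.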


By properties of the linear operators (see e.g. (Kolmogorov and Fomin, 1999), Chapter III,
§ 29), due to the assumption that $\lim_{t \to \infty} \|\mathbf{W}^t\| = 0$, we conclude that matrix $(\mathbf{I} - \mathbf{W})^{-1}$
exists.

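Now, using the induction on $t$, for any $t \ge 1$ we will obtain the identity

$$\mathbf{u}^{(t)} = \mathbf{u}^{(0)} \mathbf{W}^t + \mathbf{a} (\mathbf{I} - \mathbf{W})^{-1} (\mathbf{I} - \mathbf{W}^t),$$

which leads to inequality (7). Indeed, for the base case of $t = 1$, by the definition of $\mathbf{u}^{(1)}$ we
have the required equality. For the inductive step, we use the following relationship

$$\mathbf{u}^{(t+1)} = \mathbf{u}^{(t)} \mathbf{W} + \mathbf{a} = \mathbf{u}^{(0)} \mathbf{W}^{t+1} + \mathbf{a} (\mathbf{I} - \mathbf{W})^{-1} (\mathbf{W} - \mathbf{W}^{t+1} + \mathbf{I} - \mathbf{W})$$

$$= \mathbf{u}^{(0)} \mathbf{W}^{t+1} + \mathbf{a} (\mathbf{I} - \mathbf{W})^{-1} (\mathbf{I} - \mathbf{W}^{t+1}). \quad □$$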

In conditions of Theorem 1 the right-hand side of (7) approaches $\mathbf{a} (\mathbf{I} - \mathbf{W})^{-1}$ when $t$ tends
to infinity, thus the limit of this bound does not depend on the distribution of the initial population.
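The identity behind Theorem 1 can be checked numerically on a small example. The sketch below iterates the recursion $\mathbf{u}^{(t+1)} = \mathbf{a} + \mathbf{u}^{(t)} \mathbf{W}$ and compares the result with the closed form, rearranged as $(\mathbf{u}^{(t)} - \mathbf{u}^{(0)} \mathbf{W}^t)(\mathbf{I} - \mathbf{W}) = \mathbf{a}(\mathbf{I} - \mathbf{W}^t)$ so that no matrix inversion is needed. The matrix `alpha` and the vector `u0` are made-up illustrative values, not data from the paper, and the helper names are hypothetical.

```python
def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def mat_mul(X, Y):
    return [vec_mat(row, Y) for row in X]

def mat_pow(M, t):
    size = len(M)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(t):
        result = mat_mul(result, M)
    return result

# Hypothetical monotone matrix of lower bounds: rows i = 0..2, columns j = 1..2.
alpha = [[0.3, 0.1], [0.5, 0.2], [0.8, 0.6]]
m = len(alpha) - 1
a = alpha[0]
W = [[alpha[i][j] - alpha[i - 1][j] for j in range(m)] for i in range(1, m + 1)]

u0 = [0.5, 0.25]   # assumed value of E[z^(0)]
t = 6
u = u0
for _ in range(t):  # u^(t+1) = a + u^(t) W
    u = [a[j] + x for j, x in enumerate(vec_mat(u, W))]

# Rearranged closed form: (u^(t) - u^(0) W^t)(I - W) = a (I - W^t).
Wt = mat_pow(W, t)
I_minus_W = [[float(i == j) - W[i][j] for j in range(m)] for i in range(m)]
I_minus_Wt = [[float(i == j) - Wt[i][j] for j in range(m)] for i in range(m)]
diff = [x - y for x, y in zip(u, vec_mat(u0, Wt))]
lhs = vec_mat(diff, I_minus_W)
rhs = vec_mat(a, I_minus_Wt)
```

In practice the recursion itself is often the more convenient way to evaluate the bound, since it requires only $t$ vector-matrix products.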
In many evolutionary algorithms, an arbitrary given genotype $g'$ may be produced with
a non-zero probability as a result of mutation of any given genotype $g$. Suppose that the
probability of such a mutation is lower bounded by some $\varepsilon > 0$ for all $g, g' \in X$. Then one
can obviously choose some monotone matrix $\mathbf{A}$ of lower bounds that satisfies $\alpha_{ij} \ge \varepsilon$ for
all $i, j$. Thus, $\alpha_{mj} - \alpha_{0j} \le 1 - \varepsilon < 1$ for all $j$. In this case one can consider the matrix
norm $\|\mathbf{W}\|_\infty = \max_j \sum_{i=1}^{m} |w_{ij}|$. Due to monotonicity of $\mathbf{A}$ we have $w_{ij} = \alpha_{ij} - \alpha_{i-1,j} \ge 0$,
so $\|\mathbf{W}\|_\infty = \max_j \sum_{i=1}^{m} w_{ij} = \max_j (\alpha_{mj} - \alpha_{0j}) < 1$, and the conditions of Theorem 1 are
satisfied. A trivial example of a matrix that satisfies the above description would be a matrix $\mathbf{A}$
where all elements are equal to $\varepsilon$.

Application of Theorem 1 may be complicated due to difficulties in finding the vector
$\mathbf{a} (\mathbf{I} - \mathbf{W})^{-1}$ and in estimating the effect of multiplication by matrix $\mathbf{W}^t$. Some known
results from linear algebra can help to solve these tasks, as the example in Subsection 5.2
shows. However, sometimes it is possible to obtain a lower bound for $E[\mathbf{z}^{(t)}]$ via analysis of
the (1,1) EA algorithm, choosing an appropriate mutation operator for it. This approach is
discussed below.


Lower Bounds from Associated Markov Chain. Suppose that a partition $A_0, \dots, A_m$ defined
by $\phi_0, \dots, \phi_m$ contains no empty subsets and let $\mathbf{T}$ denote a $(m + 1) \times (m + 1)$-matrix
with components

$$t_{ij} = \alpha_{ij} - \alpha_{i,j+1}, \quad i = 0, \dots, m, \ j = 0, \dots, m - 1,$$

$$t_{im} = \alpha_{im}, \quad i = 0, \dots, m,$$

where for $j = 0$ we take $\alpha_{i0} = 1$, since $H_0 = X$.

Note that $\mathbf{T}$ is a stochastic matrix so it may be viewed as a transition matrix of a Markov
chain, associated to the set of lower bounds $\alpha_{ij}$. This chain is a model of the (1,1) EA, which
is a special case of the (1,$\lambda$) EA with $\lambda = 1$ (see Subsection 2.1). Suppose that the (1,1) EA
uses an artificial monotone mutation operator $\mathrm{Mut}'$ where the cumulative transition probabilities
are defined by the bounds $\alpha_{ij}$, $i = 0, \dots, m$, $j = 1, \dots, m$, corresponding to the EA
mutation operator Mut. Namely, given a parent genotype $x$, for any $j = 1, \dots, m$ we have
$\Pr\{\mathrm{Mut}'(x) \in A_j\} = \alpha_{ij} - \alpha_{i,j+1}$, where $i$ is such that $x \in A_i$ and $\alpha_{i,m+1}$ is taken to be 0. Operator $\mathrm{Mut}'(x)$ may be simulated
e.g. by the following two-stage procedure. At the first stage, a random index $k$ of the offspring
level is chosen with the probability distribution $\Pr\{k = j\} = \alpha_{ij} - \alpha_{i,j+1}$, $j = 1, \dots, m$,
where $i$ is the level of parent $x$. At the second stage, the offspring genotype is drawn uniformly
at random from $A_k$. (Simulation of the second stage may be computationally expensive
for some fitness functions but the complexity issues are not considered now.) The
initial search point $b^{(0)}$ of the (1,1) EA is generated at random with probability distribution
defined by the probabilities $p_i^{(0)} := \Pr\{b^{(0)} \in A_i\} = E[z_i^{(0)}] - E[z_{i+1}^{(0)}]$, $i = 0, \dots, m$. Denoting
$\mathbf{p}^{(t)} := (\Pr\{b^{(t)} \in A_0\}, \dots, \Pr\{b^{(t)} \in A_m\})$, by properties of Markov chains we get
$\mathbf{p}^{(t)} = \mathbf{p}^{(0)} \mathbf{T}^t$. The following theorem is based on a comparison of $E[\mathbf{z}^{(t)}]$ to the distribution of
the Markov chain $\mathbf{p}^{(t)}$.

Theorem 2 Suppose all level subsets $A_0, \dots, A_m$ are non-empty and matrix $\mathbf{A}$ is monotone. Then for
any $t = 1, 2, \dots$ holds

$$E[\mathbf{z}^{(t)}] \ge \mathbf{p}^{(0)} \mathbf{T}^t \mathbf{L}, \qquad (8)$$

where $\mathbf{L}$ is a triangular $(m + 1) \times (m + 1)$-matrix with components $\ell_{ij} = 1$ if $i \ge j$ and $\ell_{ij} = 0$
otherwise. Besides that, inequality (8) turns into an equation if $s = 1$, the EA mutation operator is
monotone and $\mathbf{A}$ is its matrix of cumulative transition probabilities.
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Proof. The (1,1) EA described above is identical to an EA' with A = 1, s = 1 and mutation 
operator Mut 7 . Let us denote the population vector of EA' by z ( 1 *. Obviously, 

m 

Zj*' 1 = y^Pr{6 (t) G A k }, i = (9) 

k—i 

Proposition|3]implies that in the original EA with population size A and tournament size s, the 
expectation E[zW] is lower bounded by the expectation E|z' 1 H since ij^Jl holds as an equality 
for the whole sequence of E[z (,) ] and the right-hand side of (k 5| is non-decreasing on E [z^]. 
Equality = pW T* together with § imply the required bound (J8]). □ 
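
The bound of Theorem 2 is straightforward to evaluate numerically. The following Python sketch is only an illustration under our own assumptions: the function names and the small matrix of bounds are hypothetical, the matrix is stored with rows and columns indexed by levels $0,\dots,m$ and first column equal to one, and the chain is built by the rule $t_{ij} = \alpha_{ij} - \alpha_{i,j+1}$, $t_{im} = \alpha_{im}$.

```python
def chain_from_bounds(alpha):
    """Transition matrix T with t_ij = alpha_ij - alpha_{i,j+1} and t_im = alpha_im."""
    m = len(alpha) - 1
    return [[alpha[i][j] - (alpha[i][j + 1] if j < m else 0.0)
             for j in range(m + 1)]
            for i in range(m + 1)]


def step(p, T):
    """One transition of the chain: row vector p multiplied by matrix T."""
    size = len(p)
    return [sum(p[i] * T[i][j] for i in range(size)) for j in range(size)]


def theorem2_bound(alpha, p0, t):
    """Return p0 * T^t * L, i.e. the tail sums of the level distribution after t steps."""
    T = chain_from_bounds(alpha)
    p = list(p0)
    for _ in range(t):
        p = step(p, T)
    # Multiplying by L (l_ij = 1 iff i >= j) turns level probabilities into tail sums.
    return [sum(p[j:]) for j in range(len(p))]


# Hypothetical bounds (an assumption for illustration): m = 2, every non-optimal
# level is improved by exactly one level with probability at least 1/4.
alpha = [[1.0, 0.25, 0.0],
         [1.0, 1.0, 0.25],
         [1.0, 1.0, 1.0]]
bound = theorem2_bound(alpha, [1.0, 0.0, 0.0], 2)
```

With these toy bounds and an initial point at level 0, component $j$ of `bound` is the lower bound on $E[z_j^{(2)}]$ given by (8).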

Note that inequalities (7) and (8) in Theorems 1 and 2 turn into equalities if these theorems
are applied to the EA with $\lambda = 1$ and monotone mutation operator $\mathrm{Mut}'$ defined above.
Therefore both theorems guarantee equal lower bounds on $E[z^{(t)}]$, given equal matrices $\mathbf{A}$.

Subsections 5.3 and 5.4 provide two examples illustrating how Theorem 2 may be used to
import known results on Markov chains behavior. The example from Subsection 5.4 employs
Theorem 2 for finding a vector $\alpha(I - W)^{-1}$, so that Theorem 1 may be applied to bound $E[z_m^{(t)}]$
from below.

3.2 Upper Bounds 

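In this subsection, we obtain upper bounds on $E[z_j^{(t+1)}]$ using a reasoning similar to the proof
of Proposition 3. For all $j = 1,\dots,m$ this yields:

$$\Pr\{g^{(t+1)} \in H_j \mid z^{(t)} = z\} \le \sum_{i=0}^{m} \beta_{ij}\left((1 - z_{i+1})^s - (1 - z_i)^s\right), \qquad (10)$$

which turns into equality in the case of level-based mutation. By the total probability formula
we have:

$$E[z_j^{(t+1)}] = \sum_{z \in Z_\lambda} \Pr\{g^{(t+1)} \in H_j \mid z^{(t)} = z\}\,\Pr\{z^{(t)} = z\} \le \sum_{i=0}^{m} \beta_{ij}\, E\left[(1 - z_{i+1}^{(t)})^s - (1 - z_i^{(t)})^s\right], \qquad (11)$$

so

$$E[z_j^{(t+1)}] \le \beta_{mj} - \sum_{i=1}^{m} (\beta_{ij} - \beta_{i-1,j})\, E\left[(1 - z_i^{(t)})^s\right]. \qquad (12)$$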

Under the expectation in the right-hand side we have a convex function of $z_i^{(t)}$. Therefore,
in the case of monotone matrix $\mathbf{B}$, using Jensen's inequality (see e.g. (Rudin 1987), Chapter 3)
we obtain the following proposition.

Proposition 4 If $\mathbf{B}$ is monotone then

$$E[z_j^{(t+1)}] \le \beta_{mj} - \sum_{i=1}^{m} (\beta_{ij} - \beta_{i-1,j})\,(1 - E[z_i^{(t)}])^s. \qquad (13)$$

By means of iterative application of inequality (13) the components of the expected population
vectors $E[z^{(t)}]$ may be bounded up to arbitrary $t$, starting from the initial vector $E[z^{(0)}]$.
The nonlinearity in the right-hand side of (13), however, creates an obstacle for obtaining an
analytical result similar to the bounds of Theorems 1 and 2.
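
Although no closed form is available, the iteration of (13) is easy to carry out numerically. The sketch below is a minimal illustration, not a reference implementation: the function names and the two-level matrix are our own assumptions, and vectors carry an extra component with index 0 that stands for $z_0 = 1$.

```python
def apply_upper_bound(beta, u, s):
    """Apply the right-hand side of (13) once.

    beta : (m+1) x (m+1) monotone matrix, beta[i][j] for levels i, j = 0..m
    u    : list of length m+1, u[i] bounds E[z_i] for i = 1..m (u[0] is a placeholder)
    s    : tournament size
    """
    m = len(beta) - 1
    new_u = [1.0] * (m + 1)  # component 0 corresponds to z_0 = 1
    for j in range(1, m + 1):
        correction = sum((beta[i][j] - beta[i - 1][j]) * (1.0 - u[i]) ** s
                         for i in range(1, m + 1))
        new_u[j] = beta[m][j] - correction
    return new_u


def iterate_upper_bound(beta, u0, s, t):
    """Bound E[z^(t)] by t successive applications of (13), starting from u0."""
    u = list(u0)
    for _ in range(t):
        u = apply_upper_bound(beta, u, s)
    return u


# Hypothetical two-level example (m = 1): a non-optimal parent yields an optimal
# offspring with probability at most 0.2, an optimal parent with probability at most 0.9.
beta = [[1.0, 0.2],
        [1.0, 0.9]]
bounds = iterate_upper_bound(beta, [1.0, 0.0], s=2, t=2)
```

Here `bounds[1]` bounds the expected proportion of optimal individuals after two iterations when the initial population contains none.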

Note that all of the estimates obtained up to this point are independent of the population
size and valid for arbitrary $\lambda$. In Section 4 we will see that the right-hand side of (13)
reflects the asymptotic behavior of population under monotone mutation operator as $\lambda \to \infty$.


3.3 Comparison of EA to $(1,\lambda)$ EA and (1+1) EA


This subsection shows how the probability of generating the optimal genotypes at a given
iteration of the EA relates to analogous probabilities of $(1,\lambda)$ EA and (1+1) EA. The analysis
here will be based on upper bound (13) and on some previously known results provided in
the appendix.

Suppose matrix $\mathbf{B}$ gives the upper bounds for cumulative transition probabilities of the
mutation operator Mut used in the EA. Consider the $(1,\lambda)$ EA and the (1+1) EA, based
on a monotone mutation operator $\mathrm{Mut}'$ for which $\mathbf{B}$ is the matrix of cumulative transition
probabilities and suppose that the initial solutions $b^{(0)}$ and $x^{(0)}$ have the same distribution
over the fitness levels as the best incumbent solution in the EA population $X^0$. Formally:
$\Pr\{\mathrm{Mut}'(x) \in H_j\} = \beta_{ij}$ for any $x \in A_i$, $i = 0,\dots,m$, $j = 1,\dots,m$, and
$\Pr\{b^{(0)} \in H_j\} = \Pr\{x^{(0)} \in H_j\} = \Pr\{\max_{k=1,\dots,\lambda} \phi(g_k^{(0)}) \ge \phi_j\}$, $j = 1,\dots,m$. In what follows, for
any $j = 1,\dots,m$ by $P_j^{(\tau)}$ we denote the probability that current individual $b^{(\tau)}$ on iteration $\tau$
of the $(1,\lambda)$ EA belongs to $H_j$. Analogously $Q_j^{(\tau)}$ denotes the probability $\Pr\{x^{(\tau)} \in H_j\}$ for the
(1+1) EA.

The following proposition is based on upper bound (13) and the results from (Borisovsky
2001, Borisovsky and Eremeev 2001) that allow to compare the performance of the EA, the
$(1,\lambda)$ EA and the (1+1) EA.

Proposition 5 Suppose that matrix $\mathbf{B}$ is monotone. Then for any $t \ge 0$ holds

$$E[z_m^{(t+1)}] \le \beta_{mm} - (\beta_{mm} - \beta_{m-1,m})(1 - P_m^{(t)})^s \le \beta_{mm} - (\beta_{mm} - \beta_{m-1,m})(1 - Q_m^{(t\lambda)})^s.$$

Proof. Let us compare the EA to the $(1,\lambda)$ EA and to the (1+1) EA using the mutation and
initialization procedures as described above. Theorem 6 (see the appendix) together with
Proposition 1 imply that $E[z_m^{(t)}] = \Pr\{g^{(t)} \in H_m\} \le P_m^{(t)}$ for all $t \ge 0$. Furthermore, Theorem 5
from (Borisovsky and Eremeev 2001) (see the appendix) implies that $P_m^{(t)} \le Q_m^{(t\lambda)}$ for all $t \ge 0$.
Using Proposition 4 and monotonicity of $\mathbf{B}$, we conclude that both claimed inequalities hold. □


4 EA with Monotone Mutation Operator 

First of all note that in the case of monotone mutation operator, two equal monotone matrices
of lower and upper bounds $\mathbf{A} = \mathbf{B}$ exist, so the bounds (5) and (12) give equal results, and
assuming $\Gamma = \mathbf{A} = \mathbf{B}$ we get

$$E[z_j^{(t+1)}] = \gamma_{mj} - \sum_{i=1}^{m} (\gamma_{ij} - \gamma_{i-1,j})\, E\left[(1 - z_i^{(t)})^s\right], \quad j = 1,\dots,m, \; t = 0,1,\dots. \qquad (14)$$

This equality will be used several times in what follows.

In general, the population vectors are random values whose distributions depend on $\lambda$. To
express this in the notation let us denote the proportion of genotypes from $H_i$ in population $X^t$
by $z_i^{(t)}(\lambda)$, $i = 1,\dots,m$.

The following Lemma 1 and Theorem 3 based on this lemma indicate that in the case
of monotone mutation, recursive application of the formula from the right-hand side of upper
bound (13) allows to compute the expected population vector of the infinite-population EA at
any iteration $t$.


Lemma 1 Let the EA use a monotone mutation operator with cumulative transition probabilities matrix
$\Gamma$, and let the genotypes of the initial population be identically distributed. Then
(i) for all $t = 0,1,\dots$ and $i = 1,\dots,m$ holds

$$\lim_{\lambda\to\infty}\left(E\left[(1 - z_i^{(t)}(\lambda))^s\right] - (1 - E[z_i^{(t)}(\lambda)])^s\right) = 0; \qquad (15)$$

(ii) if the sequence of $m$-dimensional vectors $u^{(0)}, u^{(1)}, \dots, u^{(t)}, \dots$ is defined as

$$u^{(0)} = E[z^{(0)}(\lambda)], \qquad (16)$$

$$u_j^{(t+1)} = \gamma_{mj} - \sum_{i=1}^{m} (\gamma_{ij} - \gamma_{i-1,j})(1 - u_i^{(t)})^s \qquad (17)$$

for $j = 1,\dots,m$ and $t \ge 0$, then $\lim_{\lambda\to\infty} E[z_j^{(t)}(\lambda)] = u_j^{(t)}$ for all $j = 1,\dots,m$ at any iteration $t$.

The main step in the proof of Lemma 1(i) will consist in showing that for a supplementary
random variable $X = (1 - z_i^{(t)}(\lambda))^s - (1 - E[z_i^{(t)}(\lambda)])^s$, the value of $|E[X]|$ is upper-bounded
by an arbitrarily small $\varepsilon > 0$. This step is made by splitting the range $[-1,1]$ of $X$ into a
"high-probability" area and a "low-probability" area in such a way that $|X|$ is at most $\varepsilon$ in the
"high-probability" area. Analogous technique is used e.g. in the proof of Lebesgue Theorem, see
e.g. Kolmogorov and Fomin (1999), Chapter VII, § 44.

Proof of Lemma 1. From (14), we conclude that if statement (i) holds, then with $\lambda \to \infty$,
the convergence of $E[z^{(t)}(\lambda)]$ to $u^{(t)}$ will imply that $E[z^{(t+1)}(\lambda)] \to u^{(t+1)}$. Thus, statement (ii)
follows by induction on $t$.

Let us now prove statement (i). Given some $t$, to prove (15) we recall the sequence of i.i.d.
random variables $I_1, I_2, \dots, I_\lambda$, where $I_k = 1$ if the $k$-th individual of population $X^t$ belongs
to $H_i$, otherwise $I_k = 0$. By the law of large numbers, for any $i = 1,\dots,m$ and $\varepsilon > 0$, we have

$$\lim_{\lambda\to\infty} \Pr\left\{\left|\frac{\sum_{k=1}^{\lambda} I_k}{\lambda} - E[I_1]\right| < \varepsilon\right\} = 1.$$

Note that $\sum_{k=1}^{\lambda} I_k/\lambda = z_i^{(t)}(\lambda)$. Besides that, due to Proposition 1, $E[I_1] = \Pr\{I_1 = 1\} =
E[z_i^{(t)}(\lambda)]$. (In the case of $t = 0$ this equality holds as well, since all individuals of the
initial population are distributed identically.) Therefore, for any $\varepsilon > 0$ the convergence
$\Pr\{|z_i^{(t)}(\lambda) - E[z_i^{(t)}(\lambda)]| < \varepsilon\} \to 1$ holds. Now by continuity of the function $(1 - x)^s$, it follows
that

$$\lim_{\lambda\to\infty} \Pr\left\{\left|(1 - z_i^{(t)}(\lambda))^s - (1 - E[z_i^{(t)}(\lambda)])^s\right| \ge \varepsilon\right\} = 0.$$

Let us denote $F_\lambda(x) := \Pr\{(1 - z_i^{(t)}(\lambda))^s - (1 - E[z_i^{(t)}(\lambda)])^s < x\}$. Then

$$\lim_{\lambda\to\infty}\left(E\left[(1 - z_i^{(t)}(\lambda))^s\right] - (1 - E[z_i^{(t)}(\lambda)])^s\right) = \lim_{\lambda\to\infty}\int_{-\infty}^{\infty} x\, dF_\lambda(x) \le$$

$$\le \lim_{\lambda\to\infty}\Pr\left\{\left|(1 - z_i^{(t)}(\lambda))^s - (1 - E[z_i^{(t)}(\lambda)])^s\right| \ge \varepsilon\right\} + \lim_{\lambda\to\infty}\int_{|x|<\varepsilon} \varepsilon\, dF_\lambda(x) \le \varepsilon$$

for arbitrary $\varepsilon > 0$, hence (15) holds. □

Combining equality (14) with claim (i) of Lemma 1 we obtain a recursive expression
for $E[z^{(t)}]$ in the infinite-population EA, which is formulated as

Theorem 3 If the mutation operator is monotone and individuals of the initial population are
distributed identically, then

$$\lim_{\lambda\to\infty} E[z_j^{(t+1)}(\lambda)] = \gamma_{mj} - \sum_{i=1}^{m} (\gamma_{ij} - \gamma_{i-1,j})\left(1 - \lim_{\lambda\to\infty} E[z_i^{(t)}(\lambda)]\right)^s \qquad (18)$$

for all $j = 1,\dots,m$, $t \ge 0$.

For any $i$, $j$ and $t \ge 0$, the term $u_j^{(t+1)}$ of the sequence defined by (17) is nondecreasing in
$u_i^{(t)}$ and in $s$ as well. With this in mind, we can expect that the components of population
vector of the infinite-population EA will typically increase with the tournament size. Theorem 4
below gives a rigorous proof of this fact under some technical conditions on distributions
of Mut and $X^0$.

Theorem 4 Let $\tilde z^{(t)}$ and $z^{(t)}$ correspond to EAs with tournament sizes $\tilde s$ and $s$, where $s < \tilde s$. Besides
that, suppose that Mut is monotone with $\gamma_{mj} > \gamma_{0j}$ for all $j = 1,\dots,m$ and the individuals of initial
populations are identically distributed so that $\Pr\{g^{(0)} \in H_i\} \in (0,1)$ for all $i = 1,\dots,m$. Then for
any $t > 0$, given a sufficiently large $\lambda$, holds

$$E[\tilde z_i^{(t)}(\lambda)] > E[z_i^{(t)}(\lambda)], \quad i = 1,\dots,m.$$

Proof. Let the sequences $\{u^{(t)}\}$ and $\{\tilde u^{(t)}\}$ be defined as in Lemma 1, corresponding to
tournament sizes $s$ and $\tilde s$. By the above assumptions, $u^{(0)} = \tilde u^{(0)}$.

Now since $\Pr\{g^{(0)} \in H_i\} \in (0,1)$ for all $i = 1,\dots,m$, we have $u_i^{(0)} = \tilde u_i^{(0)} \in (0,1)$ for
any $i = 1,\dots,m$. Thus, for all $j = 1,\dots,m$ holds

$$u_j^{(1)} = \gamma_{mj} - \sum_{i=1}^{m} (\gamma_{ij} - \gamma_{i-1,j})(1 - u_i^{(0)})^{s} < \gamma_{mj} - \sum_{i=1}^{m} (\gamma_{ij} - \gamma_{i-1,j})(1 - \tilde u_i^{(0)})^{\tilde s} = \tilde u_j^{(1)}, \qquad (19)$$

since $s < \tilde s$ and $\gamma_{ij} - \gamma_{i-1,j} > 0$ at least for one of the levels $i$ according to the assumption that
$\gamma_{mj} > \gamma_{0j}$. Due to the same reason, for all $j = 1,\dots,m$ from the last equality in (19) we get
$\tilde u_j^{(1)} < \gamma_{mj} \le 1$. Using the fact that $(1 - u_i^{(0)})^s \le 1 - u_i^{(0)}$ and re-arranging the terms as in the
proof of Proposition 3 we get

$$u_j^{(1)} \ge \gamma_{0j} + \sum_{i=1}^{m} (\gamma_{ij} - \gamma_{i-1,j})\, u_i^{(0)} > 0.$$

To sum up, for $t = 1$ we have $u_j^{(1)} < \tilde u_j^{(1)}$, $u_j^{(1)} \in (0,1)$ and $\tilde u_j^{(1)} \in (0,1)$.

Furthermore, if we assume that for all $i = 1,\dots,m$ holds $u_i^{(t-1)} < \tilde u_i^{(t-1)}$, $u_i^{(t-1)} \in (0,1)$
and $\tilde u_i^{(t-1)} \in (0,1)$ then analogously to (19) we get $u_j^{(t)} < \tilde u_j^{(t)}$ for all $j = 1,\dots,m$. Besides that,
just as in the case of $t = 1$ we get $u_j^{(t)} \in (0,1)$ and $\tilde u_j^{(t)} \in (0,1)$. So by induction we conclude
that $u_j^{(t)} < \tilde u_j^{(t)}$ for all $j = 1,\dots,m$ and all $t > 0$.

Finally, by claim (ii) of Lemma 1 for any $i$ and $t$, given a sufficiently large $\lambda$, holds
$E[\tilde z_i^{(t)}(\lambda)] > E[z_i^{(t)}(\lambda)]$. □

Informally speaking, Theorem 4 implies that in the case of monotone mutation operator an
optimal selection mechanism consists in setting $s \to \infty$, which actually converts the EA into
the $(1,\lambda)$ EA.
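
The effect of the tournament size on the infinite-population recursion (17) can be observed on a toy instance. The following sketch rests on our own assumptions: the monotone matrix `gamma` and the initial vector `u0` are hypothetical and merely satisfy the conditions of Theorem 4, and index 0 of each vector is a placeholder for $z_0 = 1$.

```python
def next_u(gamma, u, s):
    """One step of recursion (17) for tournament size s."""
    m = len(gamma) - 1
    return [1.0] + [
        gamma[m][j] - sum((gamma[i][j] - gamma[i - 1][j]) * (1.0 - u[i]) ** s
                          for i in range(1, m + 1))
        for j in range(1, m + 1)
    ]


def trajectory(gamma, u0, s, steps):
    """Vectors u^(0), ..., u^(steps) produced by recursion (17)."""
    traj = [list(u0)]
    for _ in range(steps):
        traj.append(next_u(gamma, traj[-1], s))
    return traj


# Hypothetical monotone matrix with gamma_mj > gamma_0j for j = 1, 2.
gamma = [[1.0, 0.3, 0.0],
         [1.0, 0.8, 0.3],
         [1.0, 0.9, 0.7]]
u0 = [1.0, 0.5, 0.2]  # initial proportions lie strictly between 0 and 1

small = trajectory(gamma, u0, s=1, steps=5)
large = trajectory(gamma, u0, s=3, steps=5)
```

On this instance every component of `large[t]` exceeds the corresponding component of `small[t]` for $t \ge 1$, in agreement with the theorem.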


5 Applications and Illustrative Examples 


5.1 Examples of Monotone Mutation Operators 


Let us consider two cases where the mutation is monotone and the matrices $\Gamma$ have a similar
form.

First we consider the simple fitness function OneMax($g$). Suppose that the EA uses the
bitwise mutation operator, changing every gene with a given probability $p_m$, independently of
the other genes. Let the subsets $H_0,\dots,H_m$ be defined by the level lines $\phi_0 = 0, \phi_1 = 1, \dots, \phi_m = m$
and $m = n$. The matrix $\Gamma$ for this operator could be obtained using the result from (Bäck
1993), but here we shall consider this example as a special case of a more general setting.

Let the representation of the problem admit a decomposition of the genotype string into $d$
non-overlapping substrings (called blocks here) in such a way that the fitness function equals
the number of blocks for which a certain property $\mathcal{K}$ holds. The functions of this type belong
to the class of additively decomposed functions, where the elementary functions are Boolean
and substrings are non-overlapping (see e.g. (Mühlenbein et al. 1999)). Let $K(g,\ell) = 1$ if $\mathcal{K}$
holds for the block $\ell$ of genotype $g$, and $K(g,\ell) = 0$ otherwise (here $\ell = 1,\dots,d$).

Suppose that during mutation, any block for which $\mathcal{K}$ did not hold, gets the property $\mathcal{K}$
with probability $\tilde r$, i.e.

$$\Pr\{K(\mathrm{Mut}(g),\ell) = 1 \mid K(g,\ell) = 0\} = \tilde r, \quad \ell = 1,\dots,d.$$

On the other hand, assume that a block with the property $\mathcal{K}$ keeps this property during mutation
with probability $r$, i.e.

$$\Pr\{K(\mathrm{Mut}(g),\ell) = 1 \mid K(g,\ell) = 1\} = r, \quad \ell = 1,\dots,d.$$


Let $m = d$ and the subsets $H_0,\dots,H_m$ correspond to the level lines $\phi_0 = 0, \phi_1 = 1, \dots, \phi_m = m$
again. In this case the element $\gamma_{ij}$ of cumulative transition probabilities matrix $\Gamma$ equals the
probability to obtain a genotype containing $j$ or more blocks with property $\mathcal{K}$ after mutation
of a genotype which contained $i$ blocks with this property. Let $P(k',k)$ denote the probability
that during mutation $k'$ blocks without property $\mathcal{K}$ would produce $k$ blocks with this property
and let $Q(i,l)$ denote the probability that after mutation of a set of $i$ blocks with property $\mathcal{K}$,
there will be at least $l$ blocks with property $\mathcal{K}$ among them. (If $l > i$ then $Q(i,l) := 0$.) With
these notations,

$$\gamma_{ij} = \sum_{k=0}^{m-i} P(m-i,k)\, Q(i, j-k).$$

Clearly, $P(k',k) = \binom{k'}{k}\tilde r^{k}(1 - \tilde r)^{k'-k}$ and $Q(i,l) = \sum_{\nu=0}^{\min\{i,\,i-l\}} \binom{i}{\nu}(1 - r)^{\nu} r^{i-\nu}$. Thus,

$$\gamma_{ij} = \sum_{k=0}^{m-i} \binom{m-i}{k}\tilde r^{k}(1 - \tilde r)^{m-i-k} \sum_{\nu=0}^{\min\{i,\,i-(j-k)\}} \binom{i}{\nu}(1 - r)^{\nu} r^{i-\nu}. \qquad (20)$$

It is shown in (Eremeev 2000, Borisovsky and Eremeev 2008) that if $r \ge \tilde r$ then matrix $\Gamma$
defined by (20) is monotone.

Now matrix $\Gamma$ for the bitwise mutation on OneMax function is obtained assuming that
$\tilde r = 1 - r = p_m$ and $m = d = n$. This operator is monotone in view of the above mentioned
result, if $p_m \le 0.5$, since in this case $r \ge \tilde r$. The monotonicity of bitwise mutation on OneMax
is used in works of Doerr et al. (2010) and Witt (2013).
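
Expression (20) translates directly into code. The sketch below is an illustration under our own naming (`gamma_matrix`, `is_monotone`, `r_gain` for $\tilde r$ and `r_keep` for $r$ are hypothetical identifiers); it builds $\Gamma$ for the block model and tests the monotonicity condition $\gamma_{ij} \ge \gamma_{i-1,j}$ numerically.

```python
from math import comb


def gamma_matrix(m, r_gain, r_keep):
    """Cumulative transition matrix of the block model, following (20).

    r_gain : probability that a block without the property acquires it
    r_keep : probability that a block with the property keeps it
    """
    def P(k_prime, k):
        # exactly k of k_prime blocks without the property acquire it
        return comb(k_prime, k) * r_gain ** k * (1.0 - r_gain) ** (k_prime - k)

    def Q(i, l):
        # at least l of i blocks with the property keep it (at most i - l lose it)
        if l > i:
            return 0.0
        return sum(comb(i, v) * (1.0 - r_keep) ** v * r_keep ** (i - v)
                   for v in range(min(i, i - l) + 1))

    return [[sum(P(m - i, k) * Q(i, j - k) for k in range(m - i + 1))
             for j in range(m + 1)]
            for i in range(m + 1)]


def is_monotone(G, tol=1e-12):
    """Check gamma_ij >= gamma_{i-1,j} for all i >= 1 and all j."""
    return all(G[i][j] >= G[i - 1][j] - tol
               for i in range(1, len(G)) for j in range(len(G)))


# Bitwise mutation on OneMax with n = 4 and p_m = 0.1: r_gain = p_m, r_keep = 1 - p_m.
G = gamma_matrix(4, 0.1, 0.9)
```

For instance, `G[4][4]` equals $(1 - p_m)^4$, the probability that an optimal string survives mutation, and `is_monotone(G)` holds, whereas swapping the two probabilities (that is, $p_m = 0.9$) destroys monotonicity.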

Expression (20) may also be used for finding the cumulative transition matrices of some
other optimization problems with a regular structure. As an example, below we consider the
vertex cover problem (VCP) on graphs of a special structure.

In general, the vertex cover problem is formulated as follows. Let $G = (V,E)$ be a
graph with a set of vertices $V = \{v_1,\dots,v_{|V|}\}$ and the edge set $E = \{e_1,\dots,e_{|E|}\}$ where
$e_i = \{u(i),v(i)\} \subset V$, $i = 1,\dots,|E|$. A subset $C \subseteq V$ is called a vertex cover of $G$ if every
edge has at least one endpoint in $C$. The vertex cover problem is to find a vertex cover $C^*$ of
minimal cardinality.

Suppose that the VCP is handled by the EA with the following representation: each gene
$g^i \in \{0,1\}$, $i = 1,\dots,|E|$ corresponds to an edge $e_i$ of $G$, assigning one of its endpoints which
has to be included in the cover $C(g)$. To be specific, we can assume that $g^i = 1$ means
that $u(i) \in C(g)$ and $g^i = 0$ means that $v(i) \in C(g)$. The vertices not assigned as one of
the chosen endpoints do not belong to $C(g)$. On one hand, this edge-based representation
is degenerate in the sense that one vertex cover $C$ may be encoded by different genotypes $g$.
On the other hand, any genotype $g$ defines a feasible cover $C(g)$. A natural way to choose the
fitness function in the case of this representation is to assume $\phi(g) = |V \setminus C(g)|$.

Note that most publications on evolutionary algorithms for VCP use the vertex-based
representation with $|V|$ genes, where $g^j = 1$, $j = 1,\dots,|V|$ implies inclusion of vertex $v_j$ into $C(g)$
(see e.g. (Neumann and Witt, 2010), § 12.1). In contrast to the edge-based representation, the
vertex-based representation is not degenerate but some genotypes in this representation may
define infeasible solutions.

Following (Saiko 1989) we denote by $G(m)$ the graph consisting of $m$ disconnected triangle
subgraphs. Each triangle is covered optimally by two vertices and the redundant cover
consists of three vertices. In spite of simplicity of this problem, it is proven in (Saiko 1989)
that some well-known algorithms of branch and bound type require a number of iterations
exponential in $m$ if applied to the VCP on graph $G(m)$.

In the case of $G(m)$, the fitness $\phi(g)$ coincides with the number of optimally covered triangles
in $C(g)$ (i.e. triangles where only two different vertices are chosen), since covering
all triangles non-optimally gives $C(g) = V$ and each optimally covered triangle decreases the size
of the cover by one. Let the genes representing the same triangle constitute a single block, and
let the property $\mathcal{K}$ imply that a triangle is optimally covered. Then by looking at the two possible
ways to produce a gene triplet that redundantly covers a triangle, (i) given a redundant
triangle and (ii) given an optimally covered triangle, we conclude that (i) $\tilde r = 1 - (1 - p_m)^3 - p_m^3$
and (ii) $r = 1 - p_m(1 - p_m)^2 - p_m^2(1 - p_m)$. Using (20) we obtain the cumulative transition matrix
for this mutation operator. It is easy to verify that in this case the inequality $r \ge \tilde r$ holds for
any mutation probability $p_m$, and therefore the operator is always monotone.
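
The two probabilities for a single triangle can be checked by enumerating all eight gene triplets. The following sketch is only a sanity check under our own conventions: the vertex labels, the orientation of the edges and the function names are hypothetical, and gene value 1 selects the first endpoint of an edge.

```python
from itertools import product

# Triangle on vertices 0, 1, 2; gene value 1 picks the first endpoint of an edge,
# gene value 0 picks the second one (an arbitrary labelling chosen for illustration).
EDGES = [(0, 1), (1, 2), (2, 0)]


def cover(g):
    """Vertices selected by the gene triplet g."""
    return {e[0] if bit else e[1] for e, bit in zip(EDGES, g)}


def is_optimal(g):
    """A triangle is optimally covered when exactly two vertices are selected."""
    return len(cover(g)) == 2


def prob_optimal_after_mutation(g, p):
    """Probability that bitwise mutation with rate p turns g into an optimal triplet."""
    total = 0.0
    for h in product((0, 1), repeat=3):
        flips = sum(a != b for a, b in zip(g, h))
        if is_optimal(h):
            total += p ** flips * (1.0 - p) ** (3 - flips)
    return total


p = 0.1
r_gain = prob_optimal_after_mutation((1, 1, 1), p)  # parent covers redundantly
r_keep = prob_optimal_after_mutation((1, 1, 0), p)  # parent covers optimally
```

The enumeration also shows that the difference of the two probabilities equals $(1 - 2p_m)^2$, which is non-negative for every $p_m$.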


Computational Experiments. Below we present some experimental results in comparison
with the theoretical estimates obtained in Section 3. To this end we consider an application
of the EA to the VCP on graphs $G(m)$. The average proportion of optimal genotypes in the
population for different population sizes is presented in Figure 2. Here $m = 8$, $p_m = 0.1$, $s = 2$
and $z^{(0)} = 0$ (these parameters are chosen to ensure clear visibility on plots). The statistics
is accumulated in 1000 independent runs of the algorithm where for each $t$ only one individual
$g_1^{(t)}$ was checked for optimality. Thus for each $t$ we have a series of 1000 Bernoulli trials
with a success probability $\Pr\{g_1^{(t)} \in H_m\} = E[z_m^{(t)}]$ which is estimated from the experimental
data. The 95%-confidence intervals for success probability in Bernoulli trials are computed
using the Normal approximation as described in (Cramér, 1946), Chapter 34.
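
For reference, the Normal-approximation interval for a Bernoulli success probability has the familiar form $\hat p \pm 1.96\sqrt{\hat p(1 - \hat p)/N}$. The sketch below only illustrates this formula; the function name and the count of 420 successes are made up for the example and are not taken from the experiments.

```python
from math import sqrt


def normal_ci(successes, trials, z=1.96):
    """Normal-approximation confidence interval for a Bernoulli success probability.

    z = 1.96 corresponds to the 95% confidence level.
    """
    p_hat = successes / trials
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / trials)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Made-up count for illustration: 420 optimal individuals observed in 1000 runs.
low, high = normal_ci(420, 1000)
```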

The experimental results are shown in dashed lines. The solid lines correspond to the lower
and upper bounds given by the expressions (7) and (13). The plot shows that upper bound (13)
gives a good approximation to the value of $z_m^{(t)}$ even if the population size is not large. The
lower bound (7) coincides with the experimental results when $\lambda = 1$, up to a minor sampling
error.



Figure 2: Average proportion of optimal VCP solutions and the theoretical lower and upper
bounds as functions of the iteration number. Here $s = 2$, $\lambda = 1$, 2 and 10.


Another series of experiments was carried out to compare the behavior of EAs with different
tournament sizes. Figure 3 presents the experimental results for 1000 runs of the EA with
$p_m = 0.1$, $\lambda = 100$ and $z^{(0)} = 0$ solving the VCP on $G(8)$. This plot demonstrates the increase
in the average proportion of the optimal genotypes as a function of the tournament size, which
is consistent with Theorem 4. The 95%-confidence intervals are found as described above.

5.2 Lower Bound for Randomized Local Search on Unimodal Functions. 

First of all let us describe a Randomized Local Search algorithm (RLS) which will be implicitly
studied in this subsection. At each iteration of RLS the current genotype $x$ is stored. In the
beginning of RLS execution, $x$ is initialized with some probability distribution (e.g. uniformly
over $X$). An iteration of RLS consists in building an offspring $y$ of $x$ by flipping exactly one
randomly chosen bit in $x$. If $\phi(y) > \phi(x)$ then $x$ is replaced by the new genotype $y$. The process
continues until some termination condition is met.
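
The description above corresponds to the following minimal sketch. The fixed iteration budget used as the termination condition, the uniform initialization and the choice of OneMax as the test function are our assumptions for the sake of a runnable example.

```python
import random


def rls(fitness, n, iterations, rng):
    """Randomized Local Search on bit strings of length n.

    Starts from a uniformly random string and accepts an offspring only if it is
    strictly better than the current genotype.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = fitness(x)
    for _ in range(iterations):
        k = rng.randrange(n)   # choose one bit uniformly at random
        y = x[:]
        y[k] = 1 - y[k]        # flip exactly this bit
        fy = fitness(y)
        if fy > fx:            # strict improvement required
            x, fx = y, fy
    return x, fx


def one_max(x):
    """Number of ones in the string; every non-optimal string improves by one bit flip."""
    return sum(x)


best, value = rls(one_max, n=8, iterations=10000, rng=random.Random(1))
```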

Below we will illustrate the usage of Theorem 1 on the class of $\ell$-Unimodal functions. In
this class, each function has exactly $\ell$ distinctive fitness values $\phi_0 < \phi_1 < \dots < \phi_{\ell-1}$, and
each solution in the search space is either optimal or its fitness may be improved by flipping a
single bit. Naturally we assume that $m = \ell - 1$ and that level $A_m$ consists of optimal solutions.

Figure 3: Average proportion of optimal solutions to VCP and the theoretical upper bound, as
functions of the iteration number. Here $\lambda = 100$, $s = 1$, 2 and 10.

As a mutation operator in the EA we will use a routine denoted by $\mathrm{Mut}_{RLS}$: given a genotype
$g$, this routine first changes one randomly chosen gene and if this modification improves
the genotype fitness, then $\mathrm{Mut}_{RLS}$ outputs the modified genotype, otherwise $\mathrm{Mut}_{RLS}(g)$ outputs
the genotype $g$ unchanged. Note that in the case of $\lambda = 1$, the EA with $\mathrm{Mut}_{RLS}$ mutation
becomes a version of RLS. The lower bounds from Section 3 are tight for $\lambda = 1$ (which implies
$s = 1$), therefore the following analysis in this subsection may be viewed primarily as a
study of Randomized Local Search.

Mutation operator $\mathrm{Mut}_{RLS}$ never decreases the genotype fitness and improves any non-optimal
genotype with probability at least $1/n$, so we have $\alpha_{ij} = 1$ for all $i = 1,\dots,m$, $j = 0,\dots,i$
and $\alpha_{i,i+1} = 1/n$ for $i = 0,\dots,m-1$. The chances for improvements by more than one
fitness level are not foreseeable, so we put $\alpha_{ij} = 0$ for all $i = 0,\dots,m-2$, $j = i+2,\dots,m$.
Note that this matrix $\mathbf{A}$ is monotone.

Now $\alpha = (1/n, 0, \dots, 0)$ and the matrix $W$ consists of the following elements:

$$w_{ij} = \alpha_{ij} - \alpha_{i-1,j} = \begin{cases} 1/n & \text{if } i = j - 1; \\ 1 - 1/n & \text{if } i = j; \\ 0 & \text{otherwise.} \end{cases}$$

In order to apply Theorem 1 we also need to choose an appropriate matrix norm and evaluate
this norm for matrix $W$. In this particular application we will use $\|\cdot\|_2$, which is the matrix
norm induced by the Euclidean vector norm in $\mathbb{R}^m$. It is well-known that for any matrix $W$
holds $\|W\|_2 = \sqrt{\lambda_{\max}}$, where $\lambda_{\max}$ is the maximal eigenvalue of matrix $WW^T$. Here and
below $W^T$ denotes the transpose of matrix $W$.

It is easy to check that matrix $WW^T$ is composed of zero elements everywhere except for
$m$ diagonal elements, $m - 1$ superdiagonal and $m - 1$ subdiagonal elements. In particular, it
has identical elements $(1 + (n-1)^2)/n^2$ on the diagonal and all superdiagonal and subdiagonal
elements are equal to $(n-1)/n^2$. This matrix $WW^T$ belongs to the class of tridiagonal Toeplitz
matrices and its maximal eigenvalue is

$$\lambda_{\max} = \frac{1 + (n-1)^2}{n^2} + \frac{2(n-1)}{n^2}\cos\frac{\pi}{\ell}$$

(see Theorem 7 in the appendix). Therefore

$$\|W\|_2 = \sqrt{1 - \frac{2(n-1)}{n^2}\left(1 - \cos\frac{\pi}{\ell}\right)}.$$

So $\|W\|_2 < 1$ and since matrix $\mathbf{A}$ is monotone we can apply Theorem 1.

Let us denote $e := (1,1,\dots,1) \in \mathbb{R}^m$. The vector $v = e$ satisfies the equation $v = \alpha(I - W)^{-1}$
and since $\|W\|_2 < 1$, the right-hand side in inequality (7) of Theorem 1 tends to $e$ as
$t \to \infty$.

In order to obtain an explicit lower bound on $E[z_m^{(t)}]$ for any given $t$, we will evaluate the
speed of convergence of the right-hand side in inequality (7) to $e$. Note that by properties of
matrix norms we have

$$\|eW^t\|_2 \le \|e\|_2 \cdot \|W\|_2^t = \sqrt{m}\,\|W\|_2^t. \qquad (21)$$

Thus for any distribution of initial population Theorem 1 gives a lower bound

$$E[z^{(t)}] \ge e(I - W^t) \ge e - \sqrt{m}\,\|W\|_2^t \cdot e,$$

where the last inequality holds because each component of vector $eW^t$ is upper-bounded by
$\|eW^t\|_2$ which is at most $\sqrt{m}\,\|W\|_2^t$ by inequality (21).

Finally, independently of population size $\lambda$ and tournament size $s$ we get a lower bound
for the proportion of optimal genotypes in the EA population:

$$E[z_m^{(t)}] \ge 1 - \sqrt{\ell - 1}\left(1 - \frac{2(n-1)}{n^2}\left(1 - \cos\frac{\pi}{\ell}\right)\right)^{t/2}. \qquad (22)$$

The Taylor expansion for cos(x) gives 

2 4 2 2 

7r 7r 7r n n 

C0S 1 - 1 _ 2P + 24P - 1 “ 

Now since \f 1 — x < 1 — x/2 and ln(l — x) < —x, we obtain 


EUi 


> 1 - y/l-l ( 1 - 


7r 2 (n — l)(f — l) 5 

2f 4 n 2 


> 1 — exp 


f ln(£- 1) 

l 2 


ti\ 1 

Pn 



In the case of RLS, i.e. when $\lambda = 1$, this gives the following tail bound

Corollary 1 The probability that the maximum of a fitness function from $\ell$-Unimodal is first reached
after more than t iterations of RLS is at most 2 « 1 (^ 2 - 2 of 


A positive feature of this tail bound is that it approaches to 0 exponentially fast in t. A 
weakness of Corollary[l]is that its bound is grater than 1 (and therefore useless) when t < ln(/:— 
I )f' 1 nj (7T 2 — 20£ _1 ). The obtained tail bound may be improved for some relatively small t using 
the expected RLS runtime bound and Markov inequality. Let $T$ denote the number of fitness
evaluations made in RLS until the optimum is achieved. Then the RLS runtime $E[T] \le n(\ell - 1)$
since each fitness level requires on average at most $n$ iterations of RLS. By Markov inequality
we have $\Pr\{T \ge t\} \le n(\ell - 1)/t$. This tail bound becomes meaningful as soon as $t$ reaches
$n(\ell - 1)$ but it does not give an exponential convergence and therefore yields to Corollary 1 for
large $t$. It would be interesting to compare our tail bounds to those obtainable by the approach
from (Lehre and Witt, 2014) but tight analysis of RLS is beyond the scope of this paper.
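
The trade-off between the two tail bounds can be inspected numerically. The sketch below rests on our own assumptions: it uses the quantity $\sqrt{m}\,\|W\|_2^t$ with $m = \ell - 1$ obtained from inequality (21), before the Taylor-expansion simplification that leads to Corollary 1, and the function names are hypothetical.

```python
from math import cos, pi, sqrt


def spectral_tail_bound(n, ell, t):
    """sqrt(ell - 1) * ||W||_2^t, an upper bound on 1 - E[z_m^(t)] (may exceed 1 for small t)."""
    norm_squared = 1.0 - 2.0 * (n - 1) / n ** 2 * (1.0 - cos(pi / ell))
    return sqrt(ell - 1) * norm_squared ** (t / 2.0)


def markov_tail_bound(n, ell, t):
    """n * (ell - 1) / t, from E[T] <= n * (ell - 1) and Markov inequality."""
    return n * (ell - 1) / t


# Illustrative parameters: n = 10 bits, ell = 5 distinct fitness values.
early = (spectral_tail_bound(10, 5, 50), markov_tail_bound(10, 5, 50))
late = (spectral_tail_bound(10, 5, 1000), markov_tail_bound(10, 5, 1000))
```

For these parameters the Markov bound is the smaller one at $t = 50$, while at $t = 1000$ the exponentially decaying bound is smaller by several orders of magnitude.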


5.3 Lower Bounds and Runtime Analysis for 2-SAT Problem 


The Satisfiability problem (SAT) in general is known to be NP-complete (Garey and Johnson
1979), but it is polynomially solvable in the special case denoted by 2-SAT: given a Boolean
formula in CNF where each clause contains at most two literals, find out whether a satisfying
assignment of variables exists.


Let $n$ be the number of logical variables and let $m$ be the number of clauses in the CNF. A
natural encoding of solutions is a binary string $g$ where $g_i = 1$ if the $i$-th logical variable has
the value "true" and otherwise $g_i = 0$.

We consider an EA with the tournament size $s = 1$ and the following mutation operator
$\mathrm{Mut}_{SAT}$. If some clause is not satisfied, draw one such clause at random, choose one variable
among the variables of the clause at random, and modify this variable. Otherwise keep the solution
unchanged. This method of random perturbation was proposed in the randomized algorithm
of Papadimitriou (1991) for 2-SAT which has the runtime $O(n^2)$, if the CNF is satisfiable. A
generalization of the algorithm from (Papadimitriou 1991) to the general case of SAT, known
as WalkSat algorithm, shows competitive experimental results (Selman et al. 1996). In the
special case of SAT, where each clause contains at most $k$ literals, which is denoted by $k$-SAT,
algorithm WalkSat has a runtime bound $O((2 - 2/k)^n)$ (Schöning 1999).
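
A possible implementation of the mutation operator described above is sketched below. The clause encoding (signed integers, with variables numbered from 1) and the function names are our own assumptions, chosen only to make the example self-contained.

```python
import random


def unsatisfied(clauses, g):
    """Clauses not satisfied by assignment g (a list of 0/1 values).

    A literal is a non-zero integer: +v requires variable v to be true,
    -v requires it to be false (variables are numbered from 1).
    """
    def holds(literal):
        return g[abs(literal) - 1] == (1 if literal > 0 else 0)

    return [c for c in clauses if not any(holds(lit) for lit in c)]


def mut_sat(clauses, g, rng):
    """Flip one variable of a randomly chosen unsatisfied clause, if there is one."""
    bad = unsatisfied(clauses, g)
    if not bad:
        return g[:]                      # all clauses satisfied: keep the solution
    clause = rng.choice(bad)             # random unsatisfied clause
    var = abs(rng.choice(clause)) - 1    # random variable of that clause
    h = g[:]
    h[var] = 1 - h[var]
    return h


# Small hypothetical 2-CNF: (x1 or x2) and (not x1 or x3) and (not x2 or not x3).
clauses = [(1, 2), (-1, 3), (-2, -3)]
offspring = mut_sat(clauses, [0, 0, 0], random.Random(0))
```

For the all-zero assignment only the first clause is violated, so the offspring differs from the parent in exactly one of the first two variables.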


A fitness function does not influence the EA execution when $s = 1$ but it will be useful for
our theoretical analysis. Let us assume that $\phi(g)$ equals the Hamming distance to a satisfying
assignment $g^*$. Here and below, we assume that at least one satisfying assignment $g^*$ exists.

For any non-satisfying truth assignment the improvement probability is 1/2, so we can
apply the following monotone bounds: $\alpha_{ij} = 1$ for all $i = 1,\dots,m$, $j = 0,\dots,i-1$; $\alpha_{i,i+1} =
1/2$ for $i = 0,\dots,m-1$;

$$\alpha_{ii} = \begin{cases} 1/2 & \text{if } i = 1,\dots,m-1; \\ 1 & \text{if } i = m; \end{cases}$$

$\alpha_{ij} = 0$, $i = 0,\dots,m-2$, $j = i+2,\dots,m$. These lower bounds define the Markov chain
transition probabilities $T$ with $t_{ij} = \alpha_{ij} - \alpha_{i,j+1}$, $i = 0,\dots,m$, $j = 0,\dots,m-1$ and $t_{im} = \alpha_{im}$,
$i = 0,\dots,m$, according to Subsection 3.1. It turns out that this matrix $T$ is the same as the
transition matrix of the symmetric Gambler's Ruin random walk with one reflecting barrier
(state 0) and one absorbing barrier (state $m$): $t_{0,1} = 1$, $t_{i,i+1} = t_{i,i-1} = 1/2$ for $i = 1,\dots,m-1$,
$t_{mm} = 1$, all other elements $t_{ij}$ are equal to zero. The result from (Papadimitriou, 1991) implies
that, regardless of the initial state, there exists a constant $c > 0$, such that after $cn^2$ transitions
the absorbing probability of this random walk is at least 1/2. This means that $p_m^{(t)} \ge 1/2$ for $t = cn^2$ and
the $m$-th component of the vector $p^{(0)} T^t L$ is at least 1/2 as well. Therefore Theorem 2 yields

Corollary 2 If the EA for 2-SAT has the tournament size $s = 1$ and the mutation operator $\mathrm{Mut}_{SAT}$
then the probability to generate a satisfying assignment in population $X^{cn^2}$ is at least 1/2 for some
constant $c > 0$.
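
The absorption probability of the Gambler's Ruin walk is easy to compute by iterating the chain. In the sketch below the function names and the budget of $2m^2$ steps are our illustrative choices: the expected absorption time from state 0 equals $m^2$, so Markov inequality already guarantees absorption with probability at least 1/2 within $2m^2$ steps. The constant 2 is therefore not the constant $c$ of the cited result.

```python
def gamblers_ruin_matrix(m):
    """Symmetric walk on states 0..m with state 0 reflecting and state m absorbing."""
    T = [[0.0] * (m + 1) for _ in range(m + 1)]
    T[0][1] = 1.0
    for i in range(1, m):
        T[i][i - 1] = 0.5
        T[i][i + 1] = 0.5
    T[m][m] = 1.0
    return T


def absorption_probability(m, start, steps):
    """Probability of being in the absorbing state m after the given number of steps."""
    T = gamblers_ruin_matrix(m)
    p = [0.0] * (m + 1)
    p[start] = 1.0
    for _ in range(steps):
        p = [sum(p[i] * T[i][j] for i in range(m + 1)) for j in range(m + 1)]
    return p[m]


m = 10
prob = absorption_probability(m, start=0, steps=2 * m * m)
```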

It makes sense to apply Theorem 2 only in the case of $s = 1$ in this example, since for
$s > 1$ the tournament selection is impossible without computing the Hamming distance to a
satisfying assignment which is unknown.

If the EA with $s = 1$ and mutation $\mathrm{Mut}_{SAT}$ is restarted every $t_{\max}$ iterations and $t_{\max} = cn^2$,
then the overall runtime of this iterated EA is $O(\lambda n^2)$ by Corollary 2 and Markov inequality.
Note that Corollary 2 holds for any distribution of the initial population, so the runtime
bound $O(\lambda n^2)$ applies to the EA without restarts as well. In a similar way the EA with
$\mathrm{Mut}_{SAT}$ can simulate the randomized algorithm of Schöning (Schöning 1999) for $k$-SAT with
runtime $O((2 - 2/k)^n)$.


5.4 Lower Bounds and Runtime Analysis for Balas Set Cover Problems 

In general the set cover problem (SCP) is formulated as follows. Given: a ground set $M$ and
a set of covering subsets $M_j \subseteq M$, with indices $j \in U := \{1,\dots,n\}$. A subset of indices
$J \subseteq U$ is called a cover if $\bigcup_{j \in J} M_j = M$. The goal is to find a cover of minimum cardinality. In
what follows, for any $i \in M$ we denote by $N_i$ the set of numbers of the subsets that cover an
element $i$, i.e. $N_i = \{j : i \in M_j\}$. Note that an instance of SCP may be defined by a family of
subsets $\{M_j\}$ or, alternatively, by a family of subsets $\{N_i\}$.

Suppose the binary representation of the SCP solutions is used, i.e. genes $g_j \in \{0,1\}$, $j \in U$
are the indicators of the elements from $U$, so that $J(g) = \{j \in U : g_j = 1\}$. If $J(g)$ is a
cover then we assign its fitness $\phi(g) = n - |J(g)|$; otherwise $\phi(g) = r(g)$, where $r(g) < 0$ is a
decreasing function of the number of non-covered elements from $M$.

Consider a family $B(n,k)$ of set cover problems introduced by Balas (1984). Here it is
assumed that $M = \{1,\dots,\binom{n}{n-k+1}\}$ and that all $(n-k+1)$-element subsets of $U$ are given as
subsets $N_i$, $i \in M$. Thus any collection of less than $k$ elements from $U$ belongs to $U \setminus N_i$
for some $i \in M$ and does not cover the element $i \in M$. At the same time any subset $J \subseteq U$
of size $k$ covers all elements of $M$ and therefore it is an optimal cover. Larger subsets are
non-optimal covers.

Since any $k$-element subset of $U$ is an optimal cover, family $B(n,k)$ is solvable trivially.
Nevertheless this family is known to be hard for general-purpose integer programming
algorithms (Balas 1984, Saiko 1989). In particular, it was shown in (Saiko 1989) that problems
from this class are hard to solve using the L-class enumeration method (Kolokolov 1996).
When $n$ is even and $k = n/2$, the L-class enumeration method needs an exponential number
of iterations in $n$. In what follows we analyze the EA in this special case.

Note that any $i$-element subset $J \subset U$ for $i < k$ leaves $\binom{n-i}{n-k+1}$ elements of the ground set
uncovered, regardless of the choice of elements in $J$. So in the case of tournament selection,
equivalently to studying the EA on family $B(n,n/2)$ we may study the EA where the fitness is
given by a function of unitation, so that

$$\phi(g) = \begin{cases} R(\|g\|_1) & \text{if } \|g\|_1 \ge n/2; \\ L(\|g\|_1) & \text{otherwise,} \end{cases}$$

where function $R$ is decreasing, function $L$ is increasing and $L(n/2 - 1) < R(n)$.
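
The counting argument behind the reduction to a function of unitation can be verified by brute force on a small instance of $B(n,k)$. The sketch below is only an illustration; the function name and the parameters $n = 6$, $k = 3$ are our own choices.

```python
from itertools import combinations
from math import comb


def uncovered_count(n, k, J):
    """Number of ground-set elements of B(n, k) left uncovered by the index set J.

    Every (n - k + 1)-element subset N of U = {1, ..., n} corresponds to one element
    of the ground set; that element is uncovered iff N contains no index from J.
    """
    J = set(J)
    return sum(1 for N in combinations(range(1, n + 1), n - k + 1)
               if J.isdisjoint(N))


n, k = 6, 3
counts_by_size = {
    size: {uncovered_count(n, k, J) for J in combinations(range(1, n + 1), size)}
    for size in range(k + 1)
}
```

Each set in `counts_by_size` contains a single value, so the number of uncovered elements depends only on the number of selected indices, and it drops to zero as soon as $k$ indices are selected.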

Consider the point mutation operator with tunable parameter $q > 0$ defined in Subsection 3.1.
Let $m = n/2$ and let the thresholds $\phi_0, \phi_1, \dots, \phi_m$ be equal to fitness of genotypes that
contain $0,1,\dots,m$ genes "1" accordingly. Note that $J(g)$ is a cover iff $\phi(g) \ge \phi_m$.

We have the following lower bounds: $\alpha_{ij} = 1$ for all $i = 1,\dots,m$, $j = 0,\dots,i-1$; $\alpha_{i,i+1} =
(1 - q)(n - i)/n$ for $i = 0,\dots,m-1$;

$$\alpha_{ii} = \begin{cases} q + \alpha_{i,i+1} & \text{if } i = 1,\dots,m-1; \\ q & \text{if } i = m; \end{cases}$$

$\alpha_{ij} = 0$, $i = 0,\dots,m-2$, $j = i+2,\dots,m$. These lower bounds $\alpha_{ij}$ coincide with the
corresponding cumulative transition probabilities except for level $i = m$, where we pessimistically
assume $\alpha_{mm} = q$ (in fact we could safely put $\alpha_{mm} = 0.5(1 - q) + q$ but $\alpha_{mm} = q$ is chosen to
match the model of Ehrenfests in what follows). It is easy to verify that $\mathbf{A}$ satisfies the
monotonicity condition when $q \ge 1/(n+1)$ just as we verified this in the example of monotone
mutation in Subsection 3.1.

In case we are interested in runtime bounds for the EA, rather than expected values of
vector $\mathbf{z}^{(t)}$, we can assume $\alpha'_{mm} = 1$. All other non-zero lower bounds $\alpha_{ij}$ defined above
could be relaxed by putting $\alpha'_{ij} = 1/2$. In this case we would have the associated Markov
chain with a transition matrix $\mathbf{T}'$ the same as in Subsection 5.3, resulting in the same EA
runtime bound $O(\lambda n^2)$. We shall avoid these simplifications, however, in order to obtain a
tighter runtime bound by means of the following corollary.

Corollary 3 Suppose that the EA with a tournament size $s \ge 1$ uses the point mutation operator
with parameter $q \ge 1/(n+1)$. Then given $X^0 = (0, \dots, 0)$, there exists a constant $c$ such that the
probability to reach an optimum of problem $B(n, n/2)$ within $\lceil cn \ln n \rceil$ iterations is $\Omega(n^{-0.5})$.

To prove this corollary, we will first obtain a lower bound on $E[z_m^{(t)}]$ for $t \to \infty$, using
Theorem 2 and the stationary distribution of the associated Markov chain with transition
matrix $\mathbf{T}$. After that, analogously to the proof of Corollary 1, we will compute a lower bound
on $E[z_m^{(t)}]$ for finite $t$, using Theorem 1.

Proof of Corollary 3. The Markov chain associated to the set of lower bounds $\alpha_{ij}$ defined
above has the following nonzero transition probabilities:

$$t_{ii} = q, \quad i = 0, \dots, m; \qquad t_{i,i-1} = (1-q)\,i/n, \quad t_{i,i+1} = (1-q)(1 - i/n), \quad i = 1, \dots, m-1;$$

$$t_{0,1} = 1 - q; \qquad t_{m,m-1} = 1 - q.$$

All other elements of matrix $\mathbf{T}$ are equal to zero.

The stationary distribution of the associated Markov chain may be found from the well-known
model for diffusion of P. Ehrenfest and T. Ehrenfest. Consider $n$ molecules in a rectangular
container divided into two equal parts $A$ and $B$. At any time $t$, one randomly chosen
molecule moves to the other part. The state of the system is defined by the number of
molecules $j$, $j = 0, \dots, n$, in container $A$. The corresponding random walk has transition
probabilities

$$\tau_{j,j-1} = j/n, \quad \tau_{j,j+1} = 1 - j/n, \quad j = 1, \dots, n-1, \qquad \tau_{0,1} = 1; \quad \tau_{n,n-1} = 1.$$

The stationary distribution in Ehrenfests model (see e.g. (Feller, 1957), chapter 15, § 6) is given
by $\pi_j = \binom{n}{j}/2^n$, $j = 0, \dots, n$. Grouping each couple of symmetric states (i.e. the state where $A$
contains $j$ molecules, $B$ contains $n-j$ molecules and the state where $A$ contains $n-j$ molecules
and $B$ contains $j$ molecules, $j = 0, \dots, n/2$) into one state, we conclude that the Markov chain
with transition matrix $\mathbf{T}$ has the stationary distribution $\mathbf{u} = (2\pi_0, \dots, 2\pi_m)$ for any $q < 1$. So
by Theorem 2 vector $\mathbf{u}\mathbf{L}$ is the limiting lower bound for $E[\mathbf{z}^{(t)}]$ as $t \to \infty$.
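The binomial stationary law of the Ehrenfest walk quoted above is easy to confirm with exact rational arithmetic. The following sketch builds the walk on states $0, \dots, n$, optionally with a holding probability $q$ (an assumption that mirrors the role of $q$ in the point mutation), and checks that $\pi_j = \binom{n}{j}/2^n$ is left unchanged by one transition. It does not cover the grouping of symmetric states; the helper names are illustrative.

```python
from fractions import Fraction
from math import comb


def ehrenfest_matrix(n, q=Fraction(0)):
    """Transition matrix on states 0..n; with probability q the state is kept."""
    q = Fraction(q)
    rows = []
    for j in range(n + 1):
        row = [Fraction(0)] * (n + 1)
        row[j] += q
        if j > 0:
            row[j - 1] += (1 - q) * Fraction(j, n)      # a molecule leaves A
        if j < n:
            row[j + 1] += (1 - q) * Fraction(n - j, n)  # a molecule enters A
        rows.append(row)
    return rows


def binomial_distribution(n):
    """pi_j = C(n, j) / 2^n for j = 0..n."""
    return [Fraction(comb(n, j), 2 ** n) for j in range(n + 1)]


def is_stationary(dist, matrix):
    """Check dist * matrix == dist exactly (row vector times matrix)."""
    size = len(dist)
    image = [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]
    return image == dist
```

Because the holding probability only rescales the off-diagonal entries by $1 - q$, the same distribution stays stationary for every $q < 1$, which is the property used in the proof.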

We are interested in the transient behavior of the EA, so we will obtain a lower bound for
the expected population vector $E[\mathbf{z}^{(t)}]$, given a finite $t$, using Theorem 1. Consider the matrix
norm $\|\mathbf{W}\|_\infty = \max_{i=1,\dots,m} \sum_{j=1}^{m} |w_{ij}|$, which is associated to the vector norm $\|\cdot\|_1$ in the case
of left-hand side multiplication of matrices by vectors. For the matrix $\mathbf{W}$ corresponding to the
set of lower bounds $\alpha_{ij}$ defined above, we have $\|\mathbf{W}\|_\infty = 1 - 2(1-q)/n$, i.e. the condition
$\lim_{t \to \infty} \|\mathbf{W}^t\|_\infty = 0$ is satisfied for any $q < 1$.
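The value of the row-sum norm can be reproduced from the lower bounds listed earlier. The sketch below assumes that the entries of $\mathbf{W}$ are the differences $w_{ij} = \alpha_{ij} - \alpha_{i-1,j}$, $i, j = 1, \dots, m$; this definition comes from an earlier part of the paper and is restated here as an assumption, not quoted from this section. The function names are illustrative.

```python
from fractions import Fraction


def alpha(i, j, n, q):
    """Lower bound alpha_ij for the point mutation on B(n, n/2), as listed above."""
    m = n // 2
    if j < i or j == 0:
        return Fraction(1)
    if j == i:
        return q if i == m else q + (1 - q) * Fraction(n - i, n)
    if j == i + 1:
        return (1 - q) * Fraction(n - i, n)
    return Fraction(0)


def w_matrix(n, q):
    """Assumed definition: w_ij = alpha_ij - alpha_{i-1,j} for i, j = 1..m."""
    m = n // 2
    return [
        [alpha(i, j, n, q) - alpha(i - 1, j, n, q) for j in range(1, m + 1)]
        for i in range(1, m + 1)
    ]


def row_sum_norm(matrix):
    """Maximum over rows of the sum of absolute values of the entries."""
    return max(sum(abs(x) for x in row) for row in matrix)
```

For example, `row_sum_norm(w_matrix(6, Fraction(1, 2)))` can be compared with `1 - 2 * (1 - Fraction(1, 2)) / 6`; the check is meant for even $n$ and $q \ge 1/(n+1)$ passed as a `Fraction`.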

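Let us find the vector $\mathbf{v} = \mathbf{a}(\mathbf{I} - \mathbf{W})^{-1}$, which is the limit of the right-hand side in
inequality (7) as $t \to \infty$. To this end, it suffices to solve the system of equations

$$-\frac{n-i+1}{n}\, v_{i-1} + \frac{n+1}{n}\, v_i - \frac{i}{n}\, v_{i+1} = 0, \quad i = 2, \dots, n/2 - 1, \tag{23}$$

$$\frac{n+1}{n}\, v_1 - \frac{1}{n}\, v_2 = 1, \qquad -\frac{n/2+1}{n}\, v_{n/2-1} + \frac{n+1}{n}\, v_{n/2} = 0. \tag{24}$$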


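Recall that the right-hand sides in inequalities (7) and (8) of Theorems 1 and 2 are equal, given
equal matrices $\mathbf{A}$. This suggests to put $\mathbf{v} = \mathbf{u}\mathbf{L}$, i.e.

$$v_i = \frac{1}{2^{n-1}} \sum_{j=i}^{n/2} \binom{n}{j}, \quad i = 1, \dots, n/2. \tag{25}$$

Again let $\mathbf{e} = (1, \dots, 1)$. By properties of the norms under consideration,
$\mathbf{v}\mathbf{W}^t \le \|\mathbf{v}\mathbf{W}^t\|_1 \mathbf{e} \le \|\mathbf{v}\|_1 \cdot \|\mathbf{W}\|_\infty^t\, \mathbf{e} \le m \|\mathbf{W}\|_\infty^t\, \mathbf{e}$, so by Theorem 1

$$E[\mathbf{z}^{(t)}] \ge E[\mathbf{z}^{(0)}]\mathbf{W}^t + \mathbf{a}(\mathbf{I} - \mathbf{W})^{-1}(\mathbf{I} - \mathbf{W}^t) \ge \mathbf{a}(\mathbf{I} - \mathbf{W})^{-1} - \mathbf{a}(\mathbf{I} - \mathbf{W})^{-1}\mathbf{W}^t \ge \mathbf{v} - m\|\mathbf{W}\|_\infty^t\, \mathbf{e}$$

for any $t$. With $q = 1/(n+1)$, the average proportion of feasible genotypes is lower-bounded
by

$$E[z_m^{(t)}] \ge v_m - m\left(1 - \frac{2}{n+1}\right)^t,$$

since $\|\mathbf{W}\|_\infty = 1 - 2(1-q)/n = 1 - 2/(n+1)$. Using (25) and Stirling's inequality
$\sqrt{2\pi}\, n^{n+0.5} e^{-n} \le n! \le e\, n^{n+0.5} e^{-n}$ we conclude that

$$v_m = \binom{n}{n/2} \Big/ 2^{n-1} = \Omega(n^{-0.5}).$$

Now assuming that a constant $c$ is so large that $cn \ln n \ge \frac{n+1}{2} \ln \frac{n}{v_m}$, for $t = \lceil cn \ln n \rceil$ we have
$m\left(1 - \frac{2}{n+1}\right)^t \le \frac{n}{2}\, e^{-2t/(n+1)} \le \frac{v_m}{2}$, so $E[z_m^{(t)}] \ge \frac{v_m}{2} = \Omega(n^{-0.5})$.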
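The Stirling estimate and the choice of the constant $c$ can be illustrated numerically. The sketch below assumes $v_m = \binom{n}{n/2}/2^{n-1}$ and $\|\mathbf{W}\|_\infty = 1 - 2/(n+1)$ (the case $q = 1/(n+1)$), and it uses $v_m/2$ as the target level for the vanishing term; the explicit constant $4\sqrt{2\pi}/e^2$ is what the two-sided Stirling inequality gives under that assumption. All function names are illustrative.

```python
from math import ceil, comb, e, log, pi, sqrt


def v_m(n):
    """Assumed limiting lower bound v_m = C(n, n/2) / 2^(n-1) for even n."""
    return comb(n, n // 2) / 2 ** (n - 1)


def stirling_lower_bound(n):
    """Bound on v_m from sqrt(2*pi) n^(n+0.5) e^-n <= n! <= e n^(n+0.5) e^-n."""
    return 4 * sqrt(2 * pi) / (e ** 2 * sqrt(n))


def iterations_needed(n):
    """Smallest t with m * (1 - 2/(n+1))^t <= v_m / 2, where m = n/2."""
    m = n // 2
    target = v_m(n) / (2 * m)
    return max(0, ceil(log(target) / log(1 - 2 / (n + 1))))
```

Comparing `iterations_needed(n)` with `n * log(n)` for growing even `n` gives a feel for how large $c$ has to be; in the cases tried in the test below the ratio stays below one.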

By assumption the initial population consists of all-zero strings. Therefore the presence of 
at least one individual from H m in the current population implies that an optimal solution 
to a problem B(n,n/ 2) was already found at least once. Thus, in view of Proposition HI 
after $\lceil cn \ln n \rceil$ iterations of the EA, the probability of finding an optimum is $\Omega(n^{-0.5})$,
and the corollary is proved. □


If the EA is restarted with $X^0 = (0, \dots, 0)$ every $t_{\max} = \lceil cn \ln n \rceil$ iterations, then by Markov
inequality the overall runtime of this iterated EA is $O(\lambda n^{1.5} \log n)$ for any $\lambda$.

The tools for the non-elitist EA analysis from (Corus et al., 2014; Dang and Lehre, 2016;
Eremeev, 2016) can be adjusted to upper-bound the runtime of the EA on $B(n, n/2)$, but in
such a case a non-zero selection pressure would be required with a sufficiently large $s$, and the
results would hold only for $\lambda = \Omega(\log n)$.


5.5 Upper Bound on Proportion of Optimal Genotypes in Case of OneMax 

The upper bounds on vector $\mathbf{z}^{(t)}$ obtained in Proposition [!] are not likely to be suitable for
obtaining lower bounds on the runtime of the EA in absolute terms, due to nonlinearity in the
right-hand side of (13). There are other methods for finding such lower bounds on the runtime,
proposed e.g. in (Badkobeh et al., 2014; Lehre, 2010; Sudholt, 2013). The upper bounds on
vector $\mathbf{z}^{(t)}$, however, may be used for comparison of the EA to the (1,$\lambda$) EA and the (1+1) EA,
as suggested in Proposition 5.

To illustrate such a comparison, let us consider the EA with bitwise mutation operator Mut
in the case of OneMax fitness function and assume that $\phi_i := i$, $i = 0, \dots, n$. Analogously to
the notation from Section 3, $P_n^{(\tau)}$ and $Q_n^{(\tau)}$ will stand for the probability to have an optimal
current individual on iteration $\tau$ of the (1,$\lambda$) EA and on iteration $\tau$ of the (1+1) EA, respectively. In
these algorithms we assume that the bitwise mutation operator $\mathrm{Mut}' = \mathrm{Mut}$ is used and the
initial solution is chosen uniformly from $\mathcal{X}$. Proposition 5 yields the following


Corollary 4 Suppose that the fitness function is OneMax, the initial population of the EA
consists of $\lambda$ copies of the same solution, chosen uniformly from $\mathcal{X}$, and the EA uses the bitwise mutation
operator with $p_m = 1/n$. Then for any $t \ge 0$ holds

$$E[z_n^{(t+1)}] \le \frac{1}{e} - \left(\frac{1}{e} - \frac{1}{e(n-1)}\right)\left(1 - P_n^{(t)}\right)^s \le \frac{1}{e} - \left(\frac{1}{e} - \frac{1}{e(n-1)}\right)\left(1 - Q_n^{(t\lambda)}\right)^s.$$

In particular, if the tournament size $s = 2$ then $E[z_n^{(t+1)}] \le 0.74\, P_n^{(t)} + O(n^{-1})$ and
$E[z_n^{(t+1)}] \le 0.74\, Q_n^{(t\lambda)} + O(n^{-1})$.


Proof. In the case of OneMax fitness function the bitwise mutation operator with $p_m = 1/n$
is monotone (Borisovsky and Eremeev, 2008). Application of Proposition 5 yields
$E[z_n^{(t+1)}] \le \gamma_{nn} - (\gamma_{nn} - \gamma_{n-1,n})(1 - P_n^{(t)})^s$ for the cumulative transition probabilities $\gamma_{ij}$
associated with this monotone mutation operator. It is easy to see that $\gamma_{n-1,n} \le e^{-1}/(n-1)$ and
$\gamma_{nn} \le e^{-1}$, since $(1 - 1/n)^n \le e^{-1}$. Thus, for the (1,$\lambda$) EA

$$E[z_n^{(t+1)}] \le \frac{1}{e} - \left(\frac{1}{e} - \frac{1}{e(n-1)}\right)\left(1 - P_n^{(t)}\right)^s$$

as required. In the case of $s = 2$ this inequality implies that
$E[z_n^{(t+1)}] \le \frac{2}{e} P_n^{(t)} + \frac{1}{e(n-1)} \le 0.74\, P_n^{(t)} + O(n^{-1})$. The result for the (1+1) EA follows analogously. □
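The constants in this proof are simple to check numerically. The sketch below evaluates the right-hand side of the bound for a given probability $P_n^{(t)}$ (passed as `p_opt`, a hypothetical parameter name) and the two mutation probabilities used above, assuming bitwise mutation with rate $1/n$ on OneMax.

```python
from math import e


def corollary4_bound(p_opt, s, n):
    """Right-hand side 1/e - (1/e - 1/(e(n-1))) * (1 - p_opt)^s."""
    return 1 / e - (1 / e - 1 / (e * (n - 1))) * (1 - p_opt) ** s


def gamma_nn(n):
    """Probability that bitwise mutation with rate 1/n leaves the optimum intact."""
    return (1 - 1 / n) ** n


def gamma_n1_n(n):
    """Probability to flip exactly the single zero bit of a genotype with n-1 ones."""
    return (1 / n) * (1 - 1 / n) ** (n - 1)
```

For $s = 2$ the bound expands to $\frac{1}{e}(2P - P^2) + \frac{1}{e(n-1)}(1 - P)^2$ with $P = P_n^{(t)}$, and since $2/e \approx 0.736$ this is where the coefficient $0.74$ comes from.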


A superiority of the (1+1) EA over other evolutionary algorithms in the case of OneMax
fitness function and bitwise mutation with $p_m < 0.5$ is well-known from (Borisovsky, 2001;
Borisovsky and Eremeev, 2008; Sudholt, 2013). Corollary 4 makes it possible to measure the superiority
of the (1+1) EA and the (1,$\lambda$) EA over the EA in terms of tail bounds. Note that the tail bounds for
the (1+1) EA on OneMax are well studied. In particular, the tail bound from (Lehre and Witt,
2014) implies that there exists a constant $c > 0$ such that for any $r \ge 0$ and $\tau \le en \ln n - cn - ren$
holds $Q_n^{(\tau)} \le e^{-r/2}$.


6 Conclusions 


In this paper, we presented an approximating model of a non-elitist mutation-based EA with
tournament selection and obtained upper and lower bounds on the proportion of sufficiently good
genotypes in the population using this model. In the special case of a monotone mutation operator,
the obtained bounds become tight in different situations. The analysis of the infinite-population
EA with monotone mutation suggests an optimal selection mechanism that actually converts
the EA into the (1,$\lambda$) EA.

Applications of the obtained general lower bounds give an exponentially vanishing tail 
bound for the Randomized Local Search on unimodal functions and new runtime bounds for 
the EAs on the 2-satisfiability problem and on a family of set covering problems proposed by 
E. Balas. 

It is expected that further research will involve applications of the proposed approach
to other combinatorial optimization problems, in particular, problems with regular
structure.

Most of the lower and upper bounds on expected proportions of genotypes obtained in
this paper do not take the tournament size into account. It remains an open research question
how to construct tighter bounds w.r.t. the tournament size. Subsequent research
might benefit from combining the analysis of the expectation of the population vector with some
variance analysis.


It is of interest to compare the tail bounds established in Subsections 5.2 and 5.5 to the tail
bounds obtainable using other techniques, e.g. (Lehre and Witt, 2014).

Another open question is how to incorporate the crossover operator into the approximating
model. For some types of crossover operators, such as those based on solving the optimal
recombination problem (Eremeev and Kovalenko, 2014), the lower bounds from this paper may
be easily extended, ignoring the improving capacity of crossover. It is important, however,
to take the positive effect of crossover into account, and it is not clear how the monotonicity
conditions could be meaningfully extended for this purpose.


Appendix. 


In this appendix, we reproduce two results from (Borisovsky and Eremeev, 2001)
and (Borisovsky, 2001) which are used in Section 3, and a well-known result on eigenvalues
of tridiagonal Toeplitz matrices.

The algorithms (1,$\lambda$) EA and (1+1) EA and probabilities $P_j^{(t)}$ and $Q_j^{(t)}$, $j = 1, \dots, m$, $t =
0, 1, \dots$ are defined as in Section 3. For the (1,$\lambda$) EA and for the (1+1) EA we also define the
vectors of probabilities: $\mathbf{P}^{(t)} = (P_1^{(t)}, \dots, P_m^{(t)})$, $\mathbf{Q}^{(t)} = (Q_1^{(t)}, \dots, Q_m^{(t)})$.

The following Theorem 5 from (Borisovsky and Eremeev, 2001) shows a superiority of the
(1+1) EA over the (1,$\lambda$) EA in the case of monotone mutation operator. For a fair comparison
of the algorithms (1,$\lambda$) EA and (1+1) EA, here we allow both of them to make the same number
of evaluations of the fitness function, equal to $t\lambda$.


Theorem 5 Suppose that the same monotone mutation operator $\mathrm{Mut}'$ is used in the (1+1) EA and in
the (1,$\lambda$) EA and $\mathbf{Q}^{(0)} \ge \mathbf{P}^{(0)}$. Then $\mathbf{Q}^{(t\lambda)} \ge \mathbf{P}^{(t)}$ for any $t \ge 0$.


The following theorem from (Borisovsky, 2001) compares the distribution of a fittest
individual $g_*^{(t)}$ in the EA population $X^t$ over Lebesgue subsets to such a distribution of
the (1,$\lambda$) EA. Let us define a vector $\mathbf{R}^{(t)}$ for the EA, analogously to vectors $\mathbf{P}^{(t)}$ and $\mathbf{Q}^{(t)}$:

$$\mathbf{R}^{(t)} := \left(\Pr\{g_*^{(t)} \in H_1\}, \dots, \Pr\{g_*^{(t)} \in H_m\}\right).$$

Theorem 6 Suppose that the EA and the (1,$\lambda$) EA use the same monotone mutation operator Mut and
$\mathbf{R}^{(0)} \le \mathbf{P}^{(0)}$. Then for any $t \ge 0$ holds $\mathbf{R}^{(t)} \le \mathbf{P}^{(t)}$, regardless of the selection operator used in the EA.


The original manuscript (Borisovsky, 2001) is hardly accessible, therefore we provide the
proof of Theorem 6 below.

Proof. It is sufficient to consider the case of $t = 1$, since the statement for the general case
will follow by induction on $t$. Let $b^{(1,k)}$ denote a genotype with the highest fitness among the
first $k$ offspring of $b^{(0)}$ and let $g_*^{(1,k)}$ be a genotype with the highest fitness among $g_1^{(1)}, \dots, g_k^{(1)}$
in the EA population $X^1$, for any $k = 1, \dots, \lambda$.

a) Let us first assume that $b^{(0)} \in A_i$ and $g_*^{(0)} \in A_i$ for some fixed $i$ and let a genotype $g'$ be
chosen by the selection operator of the EA. Then for arbitrary $j = 1, \dots, m$, in view of
Proposition 2 we have:

$$\Pr\{\mathrm{Mut}(g') \notin H_j \mid g_*^{(0)} \in A_i\} \ge \Pr\{\mathrm{Mut}(b^{(0)}) \notin H_j \mid b^{(0)} \in A_i\}. \tag{26}$$

Note that $\Pr\{g_*^{(1)} \notin H_j \mid g_*^{(0)} \in A_i\} \ge \Pr\{b^{(1)} \notin H_j \mid b^{(0)} \in A_i\}$, which may be established by
induction on $k = 1, \dots, \lambda - 1$ using the inequality

$$\Pr\{g_*^{(1,k+1)} \notin H_j \mid g_*^{(0)} \in A_i\} = \Pr\{g_*^{(1,k)} \notin H_j \mid g_*^{(0)} \in A_i\} \Pr\{\mathrm{Mut}(g') \notin H_j \mid g_*^{(0)} \in A_i\}$$

$$\ge \Pr\{b^{(1,k)} \notin H_j \mid b^{(0)} \in A_i\} \Pr\{\mathrm{Mut}(b^{(0)}) \notin H_j \mid b^{(0)} \in A_i\} = \Pr\{b^{(1,k+1)} \notin H_j \mid b^{(0)} \in A_i\}.$$

b) Let us prove that $\mathbf{P}^{(1)} \ge \mathbf{R}^{(1)}$ for arbitrary initial distributions of the (1,$\lambda$) EA and the
EA, assuming $\mathbf{P}^{(0)} = \mathbf{R}^{(0)}$. We use the total probability formula and the conclusion of case a):

$$\Pr\{g_*^{(1)} \notin H_j\} = \sum_{i=0}^{m} \Pr\{g_*^{(1)} \notin H_j \mid g_*^{(0)} \in A_i\} \Pr\{g_*^{(0)} \in A_i\}$$

$$\ge \sum_{i=0}^{m} \Pr\{b^{(1)} \notin H_j \mid b^{(0)} \in A_i\} \Pr\{b^{(0)} \in A_i\} = \Pr\{b^{(1)} \notin H_j\}. \tag{27}$$


c) In general, when $\mathbf{P}^{(0)} \ge \mathbf{R}^{(0)}$, let us note that according to Proposition 1 from (Borisovsky
and Eremeev, 2001), in the case of monotone mutation for any $t \ge 1$ we can consider $\mathbf{P}^{(t)}$ as
the following function on vector $\mathbf{P}^{(t-1)}$:

$$P_j^{(t)} = 1 - (1 - \gamma_{0j})^\lambda + \sum_{i=1}^{m} \left((1 - \gamma_{i-1,j})^\lambda - (1 - \gamma_{ij})^\lambda\right) P_i^{(t-1)}, \quad j = 1, \dots, m, \tag{28}$$

where $\gamma_{ij}$ are the cumulative transition probabilities of mutation operator Mut. We denote the
relationship (28) by $\mathbf{P}^{(t)} = F(\mathbf{P}^{(t-1)})$ for brevity. Then due to nonnegativity of the multipliers
of probabilities $P_i^{(t-1)}$ in (28), we conclude that $\mathbf{P}^{(1)} = F(\mathbf{P}^{(0)}) \ge F(\mathbf{R}^{(0)})$. Finally
note that the result of case b) may be written as $F(\mathbf{R}^{(0)}) \ge \mathbf{R}^{(1)}$, therefore $\mathbf{P}^{(1)} \ge \mathbf{R}^{(1)}$. □
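Relationship (28) is a summation-by-parts rewriting of the total probability formula over the levels $A_0, \dots, A_m$, and this can be verified with exact arithmetic. The sketch below assumes that an offspring of a parent from level $A_i$ lands in $H_j$ with probability exactly $\gamma_{ij}$ and that the (1,$\lambda$) EA keeps the best of $\lambda$ independent offspring; the function names and the small matrix used for testing are illustrative, not data from the paper.

```python
from fractions import Fraction


def step_by_levels(p, gamma, lam):
    """P^(t) from P^(t-1) by conditioning on the level A_i of the parent.

    p[j-1] = Pr{b in H_j} for j = 1..m; gamma[i][j-1] = gamma_ij for i = 0..m.
    """
    m = len(p)
    ext = [Fraction(1)] + list(p) + [Fraction(0)]  # P_0 = 1, P_{m+1} = 0
    return [
        sum((ext[i] - ext[i + 1]) * (1 - (1 - gamma[i][j]) ** lam)
            for i in range(m + 1))
        for j in range(m)
    ]


def step_by_formula(p, gamma, lam):
    """The same update written as relationship (28)."""
    m = len(p)
    return [
        1 - (1 - gamma[0][j]) ** lam
        + sum(((1 - gamma[i - 1][j]) ** lam - (1 - gamma[i][j]) ** lam) * p[i - 1]
              for i in range(1, m + 1))
        for j in range(m)
    ]
```

When the rows of `gamma` are non-decreasing in `i` (monotone mutation), every multiplier of `p[i - 1]` in `step_by_formula` is non-negative, so a componentwise larger input vector gives a componentwise larger output; this is the monotonicity of $F$ used in case c).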


The following result on eigenvalues of tridiagonal Toeplitz matrices may be found e.g.
in (Noschese et al., 2013).

Theorem 7 Suppose an $(n \times n)$-matrix $\mathbf{T}$ is composed of zero elements everywhere except for the
diagonal elements, which equal $\delta$, the superdiagonal elements, which equal $\tau$, and the subdiagonal elements,
which equal $\sigma$. Then all eigenvalues of $\mathbf{T}$ are given by

$$\lambda_h = \delta + 2\sqrt{\sigma\tau}\, \cos\frac{h\pi}{n+1}, \quad h = 1, \dots, n.$$
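The eigenvalue formula of Theorem 7 can be checked directly by exhibiting an eigenvector for each $h$. The sketch below assumes $\sigma > 0$ and $\tau > 0$ and uses the standard closed form $x_k = (\sigma/\tau)^{k/2} \sin\frac{hk\pi}{n+1}$ for the eigenvector, which is not stated in the text above; it then measures how far $\mathbf{T}x$ is from $\lambda_h x$. The function names are illustrative.

```python
from math import cos, pi, sin, sqrt


def toeplitz_eigenpair(n, delta, sigma, tau, h):
    """Eigenvalue from Theorem 7 and a matching eigenvector (assumes sigma, tau > 0)."""
    theta = h * pi / (n + 1)
    value = delta + 2 * sqrt(sigma * tau) * cos(theta)
    ratio = sqrt(sigma / tau)
    vector = [ratio ** k * sin(k * theta) for k in range(1, n + 1)]
    return value, vector


def residual(n, delta, sigma, tau, h):
    """Largest entry of |T x - lambda x| for the tridiagonal Toeplitz matrix T."""
    value, x = toeplitz_eigenpair(n, delta, sigma, tau, h)
    worst = 0.0
    for k in range(n):
        tx = delta * x[k]
        if k > 0:
            tx += sigma * x[k - 1]   # subdiagonal entry of row k
        if k < n - 1:
            tx += tau * x[k + 1]     # superdiagonal entry of row k
        worst = max(worst, abs(tx - value * x[k]))
    return worst
```

A residual at the level of floating-point rounding for every $h = 1, \dots, n$ confirms that the $n$ values $\lambda_h$ are eigenvalues; since they are pairwise distinct when $\sigma\tau > 0$, they exhaust the spectrum.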